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Abstract 

This thesis studies how to prove equivalences between programs in languages 
with selective strictness, specifically, we use a restricted core lazy functional lan- 
guage with a selective strictness operator seq. We establish some of the first 
ever equivalences between lazy programs with selective strictness by manipulat- 
ing operational semantics derivations. Our operational semantics is similar to 
that used by van Eekelen and De Mol, though we introduce a 'garbage-collecting' 
rule for (let) which turns out to cause expressiveness restrictions. For example, 
arguably reasonable lazy programs such as let y = Xz.z in Xx.y do not reduce in 
our operational semantics. 

We prove some properties of seq, including associativity, idempotence, and 
left-commutativity. The proofs use our three notions of program equivalence de- 
fined on a big step operational semantics for a core functional language with 
selective strictness. We also show that, our garbage-collecting (let) rule causes 
the apparently weaker ~„ to coincide with ~ s . Left-commutativity is particu- 
larly difficult to prove as it is necessary to consider transforms of derivations 
which make subderivations larger, rather than smaller, and hence induction on 
derivations or the size of derivations fails. 

To prove our results we make some new fundamental observations about our 
operational semantics and a theory describing a dependency relation between 
the evaluation of expressions bound to variables in heaps; there follows a notion 
of bisimilarity on heaps. These notions are general and are not restricted to 
selective strictness. We consider a measure of complexity of a derivation which 
does not have to do with its size; this is diff(JI). This is deceptively simple, 
being just the variables in the heap whose bindings are changed by the evaluation. 
However, using the dependency relation, we can reason by induction on the size 
of diff(H). It turns out that this quantity can be conveniently 'made smaller', 
even if subderivations of II become larger, and thus we obtain a useful inductive 
quantity. 
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Chapter 

Introduction 



Introducing limited, or selective, strictness into an otherwise lazy language can 
greatly improve performance. Selective strictness can improve the time or space 
efficiency of data structures, for example the unboxed arrays of Haskell and 
Clean improve both time and space performance. Similarly, selective strictness 
can improve the time or space efficiency of computations. Folklore abounds with 
stories of solving space leaks by adding sufficient strictness to release large data 
structures early in the computation. Finally, selective strictness can be used to 
control evaluation order; this is especially important for parallel evaluation. For 
instance, the parallel language Eden [ )] extends Haskell with mechanisms for 
eager process creation and eager value communication. 

Selective strictness is introduced using a variety of mechanisms including spe- 
cial data structures like unboxed arrays, type annotations, and language con- 
structs. Indeed, many languages offer more than one selective strictness mech- 
anism; both Haskell and Clean offer all three of the mechanisms mentioned 
above. In this thesis, we focus on one selective strictness language construct, the 
seq in Haskell, which also corresponds closely to the let! in Clean. 

Selective strictness is useful, and some would argue that it is essential, for 
effective software development in lazy languages. However, it is also true that 
unconsidered use of strictness can degrade program time or space performance, 
and even cause a previously terminating program to diverge. Programmers are 
warned of these hazards, see for example [ 7 7, Section 6.2]. So, we need to be able 
to reason about this powerful but dangerous construct. 

1.1 Contributions 

The purpose of this thesis is therefore to contribute to the relatively limited body 
of work on reasoning about selective strictness. The research contributions of the 
work are as follows: 

• We establish some of the first ever equivalences between programs with 
selective strictness. We do this by manipulating operational semantics 
derivations, in contrast to earlier related works which use a denotational 
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approach [4i'] or a theorems-for-free one [ ]. Our operational semantics is 
similar to that used by van Eekelen and De Mol[105], though we introduce 
a 'garbage-collecting' rule for let. 

This garbage-collection restricts the expressiveness of our system, and hence 
there are some arguably reasonable lazy programs that do not reduce in our 
operational semantics. As will be demonstrated in Remark 3.5, an example 
of such programs is let y = Xz.z in Xx.y. In Section 11.3, we illustrate how, 
as another example, let X\ = Xx.x in ((let x 2 = Xx.x in Xx.x 2 ) X\) does 
not reduce in our system. Despite such restrictions, and apart from the 
technical benefits garbage-collection brings to our proofs (in Theorem 8.4 
and Corollary 8.6 for example), as discussed in Section 11.3, we still believe 
that garbage-collection is necessary for (let) rules in lazy evaluation. (One 
can also consult Remark 8.13.) How to retain garbage-collecting (let) rules 
without restricting the expressiveness is future work. 

We prove some elementary properties of seq, including associativity (The- 
orem 5.1, strengthened in Theorem 10.2) and a weak form of idempotence 
(Theorem 5.5) in Chapter 5. The proofs use our three notions of program 
equivalence (Chapter 4) defined on a big step operational semantics for a 
core functional language with selective strictness (Chapter 3). 

We call these properties 'elementary', not necessarily because they are triv- 
ial, but because their proofs are by fairly straightforward transformations 
on derivations in the operational semantics, and by structural inductions 
on those derivations. 

We prove left-commutativity and idempotence of seq (Theorems 9.3 and 10.2). 
We also show that, our garbage-collecting (let) rule will cause the appar- 
ently weaker ~„ coincides with ~. s (Section 8.2). 

Here, the proofs are harder. The underlying reason is that it is necessary to 
consider transforms of derivations which make subderivations larger, rather 
than smaller; induction on derivations or the size of derivations fails. 

To prove our results, we make some new fundamental observations about 
our operational semantics (Chapter 6) and a theory describing a depen- 
dency relation between the evaluation of expressions bound to variables 
in heaps (Chapter 7); there follows a notion of equivalence on heaps: « 
(Chapter 8). These notions are general and are not restricted to selective 
strictness. However, it happens that, for our specific operational semantics 
(Definition 3.4), sa is a bisimilarity relation (Section 8.3). In future work, 
we hope to apply these notions to reasoning about call-by-need evaluation 
in general, and to parallel evaluation. 

We consider a measure of complexity of a derivation which does not have 
to do with its size; this is diff (II) (Definition 6.4). This is deceptively 
simple, being just the variables in the heap whose bindings are changed 
by the evaluation. However, using the dependency relation, we can prove 
three central technical results: Theorems 7.10, 7.11, and 7.12. These results 
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together allow us to reason by induction on the size of diff(JI). It turns out 
that this quantity can be conveniently 'made smaller', even if subderivations 
of II become larger, and thus we obtain a useful inductive quantity. The 
reader can see how this inductive proof-method is applied, in Theorem 8.4, 
Lemma 9.1, and Theorem 10.1. 

• In the course of proving our results we will come across other more technical 
theorems, which we need for some specific purpose but which are attractive 
in themselves. Examples are Determinism (Theorem 3.9), Theorem 4.6, 
Theorem 6.9, and Theorem 6.23. 

For the reader's convenience, we open each chapter with a technical outline of 
the material covered in it. 



1.2 Authorship 

Chapters 3 to 10 of this thesis are substantially based on joint work with Drs 
Gabbay, Trinder and Prof Ortega-Mallen [71]. The author's specific contributions 
to the research are as follows: 

• To contribute to the formulation of [71], and to check many of the proof 
sketches in it. 

• To prove a total of 38 theorems, lemmata, and corollaries. More specifically: 
The author replaced the proof sketches from [71] with complete proofs for 
15 theorems, lemmata, and corollaries. He also proved 8 lemmas for which 
there was no proof in [71]. And, finally, the above 23 results entailed 15 
additional lemmata and corollaries all of which the student proved. 

• To formulate the notions of equivalence in Chapter 4. 

• To prove the Associativity and Idempotence of seq in Chapter 5. 

• To contribute to the development of Analogous Heaps («) in Chapter 8, 
conjecture that it gives rise to a bisimilarity, formulate Definition 8.16 with 
a hint from Prof Ortega-Mallen, and to prove this conjecture (Section 8.3). 

• To contribute in the development of induction on the size of diff (H). This 
is through suggesting earlier formulations for diff (U) and refuting all the 
similar earlier strategies for proving Left-commutativity (Theorem 9.3). 
It is also noteworthy that it was the author who first suggested Left- 
Commutativity. 



1.3 Structure 

This thesis is structured as follows: we start by reviewing the related work in 
Chapter 2. Next, in Chapter 3, we present our operational semantics along with 
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some of its fundamental properties. We then define our notions of equivalence in 
Chapter 4 and make a few technical observations about them. Elementary prop- 
erties of seq are considered in Chapter 5. Based on these, in Chapter 6, we look 
at the semantics of the systems based on Launchbury's operational semantics. 
Atomic variables and how they relate to the internals of heaps are considered 
in Chapter 7. This leads us to the useful notion of analogous heaps which is 
studied in Chapter 8. Left-commutativity and idempotence of seq are proved in 
Chapters 9 and 10, respectively. In Chapter 11, we make concluding remarks and 
consider some possible future research directions. 



2 



Chapter 

Related Work 



So-called 'lazy' programming languages like Haskell and Clean in fact im- 
plement call-by-need evaluation as first proposed by Wadsworth [114]. This is 
normal order reduction to weak head normal form refined with subexpression 
sharing. Therefore, to reason about selective strictness in such languages, we 
must first formalise normal order evaluation. 

Section 2.1 covers work on call-by-need. Ong and Abramsky [73, 2, 3] studied 
this, but they did not capture sharing. Both [ ] and [ ] present a big-step 
semantics for call-by-need. Context-based representations are defined in [10, 7, 
62]. We choose to build on Launchbury's big step semantics [55] (Chapter 3). 
This models a restricted core call-by-need evaluation at a level of abstraction 
suitable for our needs; it is more abstract than lazy abstract machines, and more 
concrete than mostly denotational treatments like those of Abramsky and Ong. 

Before going to selective strictness, Section 2.2 discusses the change in seman- 
tics of seq over time. Section 2.2, furthermore, explains the current diversity of 
viewpoint between the campaigns in the Haskell community over the semantics 
of seq versus its other variation pseq. 

Section 2.3 reviews the work on selective strictness. In prior work other re- 
searchers have addressed the issues raised by selective strictness. Harrison et al. 
[37] provides a calculational approach, in which every detail of how to implement 
a Haskell interpreter in Haskell itself is explained. Later, [38] uses P-logic 
[52] to specify and verify properties of programs with selective strictness. In [106], 
van Eekelen and de Mol extend Launchbury's semantics to selective strictness. 
Other related works of this group are [ ] and [1 08]. We choose to directly rea- 
son on derivation trees of an operational semantics which is a variation of that of 
[106]. Johann and Voigtlander [47] study free theorems in presence of selective 
strictness. [48, 110, 111, 97, 98] all extend this work. 

In Section 2.4, we consider other related work. 

In Sections 2.1-2.4, works are categorised based on the authors. The categories 
are then ordered chronologically. Each section starts with an explanation on the 
common features of the reviewed work of the respective section. Section 2.1 ends 
in a comparison between the work of the section with that of ours. 
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Xx.P ^ Xx.P 

M $ Xx.P P[x := Q] $ N 

MQ|iV 



Figure 2.1: Operational Semantics of Abramsky and Ong 

2.1 Call-by-Need 

In this section, we explore the different work which present models or operational 
semantics for call-by-need. Ariola et al. [10, 7, 6 ] chose "call-by-need" to only 
describe calculi — as opposed to evaluation strategies. In their terminology, "lazy 
evaluation" is considered as an evaluation strategy for call-by-need. Unlike Ariola 
et al., we will not distinguish between call-by-need and lazy evaluation or even 
"laziness" . Different properties of each system will be discussed. At the end of the 
section the decision to chose Launchbury's operational semantics [55] is justified. 

Abramsky and Ong (1988, 1990, and 1993) 

Abramsky and Ong's motivation is to fill a gap between the existing theory at 
the time of their writing and the common practice of functional programming 
languages of the same time [2, 73, 3]. At the time the theory was powerful 
enough to give meaning to A-terms up to head-normal form (hnf). Whereas 
many functional programming languages of the time ceased evaluation when the 
terms are only in weak-head normal form (whnf). This practice is well-motivated 
in lazy evaluation, for which they offer a powerful theory. 

Ong [73] starts by introducing his operational semantics which comes in Fig- 
ure 2.1. Next, he acknowledges the definition of applicative bisimilarity (^ B ) 
from Abramsky as: 



M ~ B N O (M ~ N) A (N ~ M) where 

D D 

M ~ N & (M ty Xx.P => N fy Xx.Q A VR.P[x := R] ~ Q[x := R]). 



This definition gives rise to the XI = (A , C, =) inequational theory where l 

XEh M QN = M ~ B N 
A£ h M = N = M - B N. 



The rest [73] explores various properties of XL For example, Ong shows that 

~ is a Morris-style pre-congruence [70]. More notably, he shows that XI is not 
fully abstract with regard to D — the initial solution to the domain equation: 
D = [D —> D]±. Several extensions of X£ are also explored afterwards. 



1 A° denotes the collection of all closed A-terms. 
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The seminal work [! ] starts by constructing Abramsky's theory by the study 
of Applicative Transition Systems (ats). He offers a domain equation for ats'es 
next. This helps him to show a number of important results about their sys- 
tem such as computational adequacy. He also constructs a domain logic and 
shows how to employ this for the study of the equational theory over programs. 
His handy vehicle in [ ] is his notion of applicative simulation and bisimulation. 
This is essentially based on the same operational semantics as Figure 2.1. He 
shows that this notions coincide with that of traditional "Morris- Style" contex- 
tual equivalence [70, 12, 66, 86]. This work is also equipped with a variety of 
other technical side-products. For example, he shows that Q, = (Xx.xx)(\x.xx) 
is a least element of their simulation, and YK is a top element for that. 2 

The hnal work of Abramsky and Ong on lazy evaluation is [3] where all the 
proofs of their previous works are thoroughly filled. As shown in Figure 2.1, their 
work [2, 73, 3] fails to capture sharing — which is crucially important in the 
context of lazy evaluation. This is the case because in their study there is no 
book-keeping media; that is, either a store or a rule for appropriately updating 
let-bindings. (As a matter of fact, let-bindings are completely absent in the work 
of Abramsy and Ong.) An alternate point of view on this (due to [ ]) is that: 
Abramsky and Ong explored the model theoretic properties of the call-by-need 
calculus Plotkin offered along with his evaluator to A-abstractions [85]. 

Purushothaman and Seaman(1992, 1994, and 1996) 

The first work of this group [ ] presents the first operational semantics for 
lazy evaluation which does capture sharing. This is for a language called LAZY- 
PCF+Shar which extends the PCF syntax with explicit substitutions. Their 
operational semantics is based on judgements of the form <$c e, A ;»^^C e', A' S> 
where e 1 might be a simple expression or one with several layers of explicit substi- 
tutions (to represent let-bindings). Purushothaman and Seaman establish sound- 
ness and computational adequacy with respect to a fixed-point semantics for PCF 
[86]. Whilst proof sketches are provided in this work, the details are left to [88]. 

In their operational semantics, As are lists of bindings of the form X{ \— > e^ 
where FV(e-i) C {xj}f +l . In addition to the fact that working with lists rather 
than sets makes mutual recursion problematic, it imposes several other difficulties 
in working with their system. For example, variable evaluations need an ordered 
list traversal until one reaches to the actual variable to be evaluated. (This is 
Var2 and Varl in Figure 2.2, respectively.) Another example of such difficulties 
is the necessity of a lemma which proves that the order of certain bindings can be 
changed under right conditions (Lemma 4.1 in [89]). Finally, this restriction has 
also unnaturally affected their Rec rule (Figure 2.2): each unfolding of a recursive 
expression will produce a fresh binding. 

On the other hand, their operational semantics is typed. Although they em- 
ploy the type information to prove certain desirable well-formedness properties 
of their system, the necessity of typing remains questionable. Testimony to this 



2 One can think of YK as the infinite process solving the equation ip = Xx.tp. 
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Varl: 



«; e, A »^<<C N,A' » 



« 2:, [s : f h e].A »^«C iV, [1 : t ^ AT] .A' » 



« y, A :»-►«: AT, A' » 

Var2: j/^x 

« ?/, [1 : t w e].A ;»—><§; N, [x : t i-> e].A' ;» 

«; e b A »->«: AT, A' » <<c Ap(AT, e 2 ), A' »->«: AT', A" » 

Appl: 

«: e 1 e 2 , A »->«: A/"', A" » 

^C (e[ra/z], [nx : s ^ /ii : £.e]), A »—><<;: A", A' » 



Rec: 



«; ^x : t.e, A »— >«; AT, A' » 



y4p(Ax : t.eo, e) = (eo[nx/x], [nx : t 1 — ^ e]) 

Ap((N, [x : 1 1-> ei]), e) = ((if, [i:tH e^), [n : s h^ e]) 
where Ap(N, e) = (K, [n : s 1— > e]). 



Note: ?^x denotes a new variable. 



Figure 2.2: Varl, Var2, Appl, and Rec of [89] 

is that, in most of their proofs, the typing information is fully dismissed. Finally, 
the fact that, unlike Launchbury [ ], they do not normalise expressions prior to 
evaluation makes their Appl rule considerably complicated. See Figure 2.2. 

In their next work [95], Seaman and Purushothaman change a few minor 
properties of their previous system [89]. As they put it, their motivation is to 
provide a formalism for lazy evaluation which both captures sharing and is useful 
for reasonings. Examples of reasoning that they mention include compile-time 
analysis such as garbage-collection, order of evaluation, and update-in-place [12, 
26, 35]. They argue that because the sharing behaviour will not be present in a 
denotational model a denotational semantics is not a suitable basis for a proof of 
correctness of such an analysis. With this in mind, they change some aspects of 
lazy-PCF + Shar in the ways that will be explained later. 

Unlike their previous work [ ], this time, they prove the computational cor- 
rectness of their semantics by showing that it is equivalent to a call-by-name 
operational semantics found in [ ]. In order to do this, they offer an inter- 
mediate semantics in which only all the typing information are stripped. This 
puts yet more question on the necessity of type information in their operational 
semantics. They also prove some nice properties of their operational semantics 
which are present in our system. This includes: Determinism (Thoerem 3.9), 
Lemma 5.3, Lemma 5.4, and a weaker form of Lemma 3.10. 

The judgments of Seaman and Purushothaman in [95] are of the form: (e, A) J, 
(e',A'). Below is a discussion on the changes of this operational semantics over 
[89]. More details on the exact rules comes in Figure 2.3: 
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(ei, A) | (Xx : s.e, A') (e[nx/x], [nx : s i-> e 2 ]A') | (e', A") 
{Appl} 



{Rec} 



<( ei e 2 ),,4> j( e U"> 
(e[nx/x], [nx : s i— ► /zx : t.e]A) j (e', A') 
(fix :t.e,A) | (e',A') 



Figure 2.3: The {Appl} and {Rec} of [95] 

1. They remove the explicit substitutions from their syntax; they handle sub- 
stitutions in their operational semantics only. As they explain explicit sub- 
stitutions, per se, are not suitable for modelling sharing in its lazy evaluation 
fashion. Therefore, they decide to dismiss this in their syntax. However, 
despite the removal of explicit substitutions from their syntax, they still do 
not add any syntax for let-binding. 

2. They update the Appl rule of Figure 2.2 to {Appl} rule of Figure 2.3. 
Despite the change — which simply accommodates removal of explicit sub- 
stitutions — not normalising arguments prior to evaluation causes com- 
plications in the operational semantics: they still need to augment their 
environment on the spot. That is replacing A' by [nx : s i— > e 2 ]A' (list 
concatenation). 

3. Similarly, the removal of explicit substitutions from their syntax demands 
a new rule for recursions (Rec in Figure 2.2). Making substitutions implicit 
does not remove their need for production of fresh bindings upon every 
unfolding of a recursive expression. This prevents their semantics from 
capturing sharing of recursive functions. 

The final work of this group [ ] is similar to [05] , with some minor changes 
which will be explained later. In [96], Seaman and Purushothaman report that 
their operational semantics is used as a basis for an analysis called reduction of 
variables [' ] which indicates whether the result of an evaluation is referred to 
by some variable that could be used later in the program. Like their previous 
operational semantics, this third semantics also suffers from addition of fresh 
bindings upon every unfolding of recursion. They suggest two ways to overcome 
this problem. Regardless of the fact that neither of their suggestions is shown 
to work, implementing either suggestion would dictate a drastic change to their 
entire machinery. (See the relevant discussion of Seaman and Purushothaman in 
Section 2.6 of [96].) 

The judgements of the operational semantics offered in [! ] are of the form: 
(e, A)z | (v, A')z> where the subscript is a list of fresh names, and v ranges over 
values. The major differences between this semantics and their previous ones are 
on the rules shown in Figure 2.4. Most notably, they merge {Varl} and {Var2} 
into {Var} to remove the need for list traversal for every variable evaluation. 
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(e,A) z l(v,A') 1 
{Var} 



{Appl} 



(x, A Q [x:t>-> e]A) z j (v, A [x : t ^ v]A') z , 
(ei, A) z | (Ax : s.e, A') Z:Z , (e[z/x], [z : s .-» e 2 ]A')z, | (u, A")z« 
<(eie 2 ),4) z j^A'V 
(e[z/x\, [z :t^ iix : t.e]A) z j (v, A') z > 



{Rec} 



(fix : £.e) 2:Z | (v,A')> 



Figure 2.4: The {Var}, {Appl}, and {Rec} of [96] 

Launchbury (1993) 

Launchbury's work [ ] is of an intermediate level of granularity, being at a higher 
level of abstraction than the variety of lazy abstract machines [49, 76, 27, 53, 
9], and lower level than the full-fledged theories [2, 73, 3]. He offers a natural 
semantics []7, 51] for lazy evaluation that does capture sharing. Because his 
natural semantics captures a range of commonalities between these machines, 
it offers a fertile framework for the study of different aspects of behaviours of 
lazy functional programmes. For example, Launchbury himself speaks of the 
possibility of expanding his work for taking "Garbage Collection" and "Cost 
of Computation" into account. Launchbury's work truly captures sharing, and 
provides proofs for the Soundness, Completeness, and Computational Adequacy 
of his heap-based model (with respect to the Denotational Semantics that he 
presents). 

Launchbury's big-step operational semantics is illustrated in Figure 2.5. (Note 
that in his semantics z ranges over values.) What perfectly fits this semantics for 
our application is that, in addition to a means for lazily evaluating the expressions, 
it tracks the effects of evaluations in their environments. The medium which 
utilises this is the notion of heaps which play a critical role in our work as well. 
As analysed earlier, the similar work which provide store-like notions (such as 
Purushothaman and Seaman [89, 95, i ]) do not completely dovetail into our 
purposes because they put constraints on the order of bindings. 

Ariola et al. (1995, 1997, and 1998) 

The hrst work of this group [10] aims to hnd a match between the operational 
semantics of A-calculus and the actual behaviour of lazy functional languages by 
the time [10] was written. They argue that the low-level nature of natural se- 
mantics such as [89, 55, 95, 96] permits neither a simple explanation of language 
implementations, nor source-level reasoning about program behaviour. Over the 
course of time, we now however know that this is not true. For example, the 
variety of interesting properties in [55], provides an elegant basis for explaining 



Chapter 2. Related Work 11 



T : Xx.e^T : Xx.e 
r : e J| A : Xy.e' A : e'[x/y] J| Q : z 
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Figure 2.5: Launchbury's Operational Semantics 

the implementation of languages such as GpH [103]. (See [36, 11] for more expla- 
nation on this.) Moreover, works such as [105, 106, 108] very well demonstrate 
the suitability of natural semantics for the source-level reasoning. This thesis also 
hopes to demonstrate this suitability. 

[10] starts by offering a calculus for call-by-need. This is A| et syntax and 
small-step operational step of which comes in Figure 2.6. Directly manipulat- 
ing let-bindings as opposed to dealing with stores (like heaps in [ ] or lists of 
bindings in [89, 95, 96]) is remarkable in A| et . As the authors themselves explain, 
working without a separate store is one their aims; they consider this a lift in the 
level of operational semantics for A-calculus. Ariola et al. show that A| et enjoys 
confluence, as well as equivalence to call-by-need. 3 Their proofs rely on a heavy 
use of directed acyclic graphs enriched with boxes and labelled edges [9]. We 
believe this could have been greatly simplified by not dismissing stores. However, 
this suggests that their approach gives a more appropriate means for reasoning 
about languages semantics of which rely on graph term rewriting. 

Generally, a deficiency of A| et is its treatment for recursive values. In order to 
accommodate this, Ariola et al. suggest restricting substitutions to only occur 
when a variable appears in the hole of an evaluation context. The result is A| 
— an operational semantics which looks like that of Launchbury [ ]. X( and 
Launchbury's operational semantics (Figure 2.5) are so similar that Ariola et al. 
also clarify it (Observation 8.2 in [10]): 

Proposition. ($)M JJ. (V)V iff X e h let $ in M i — > let * in V. 

need 

(In words, (<3>)Af JJ- (^)V can be read as: evaluating M in the store $ results 
in V with other effects reflected in \E>.) They claim that the correctness proof of 
Xe is possible via showing its soundness and completeness with respect to Ariola 
and Klop's call-by-name with cycles [8]. 



3 It is noteworthy that, as is traditional [S5], their notion of equivalence only cares about 
reducibility. This is certainly weaker than the notions of equivalence that we use in this thesis. 
Furthermore, they only test reducibility for closed terms, whereas there is no such restriction 
in our notions. See Chapter 4. 



Chapter 2. Related Work 



12 
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V,W ::= 
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V | let x = M in A 
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L,M,N ::= 


x | V | M N | let x = M in JV 


Axioms 








(let-l) 


{Xx.M] 
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= let x = M in A 


(let-V) 


let x = 


V in C[x] 


= let x = V in C[V] 


(let-C) 


(let x = 


= L in M) N 


= \etx = L \n M N 


(let-A) 


let y = 


(let x = L in M) 


in A^ = let x = L in let y = M in A 



Figure 2.6: A, et of [10] 
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deref 
lift 



assoc 



Figure 2.7: A 



need 



of 



I ] is the next work of this group where Ariola and Felleisen present a slightly 
different operational semantics; See X nee d in Figure 2.7. They prove that their 
calculus is sound and complete with respect to Plotkin's call-by-name calculus 
[85]. In addition, Ariola and Felleisen show that their X nee( i satisfies a Curry- Feys- 
style Standardisation Theorem. They also show that X nee( j is a strict sub-theory 
of call-by-name. Some basic properties of X nee d are also established: confluence 
and Huet's notion of absence of critical pairs [44]. 

This paper offers a thorough explanation about their design decisions in which 
they compare environment-based implementations of laziness with graph-based 
ones. | ] has an emphasis on source syntax over graphs: they again employ 
enriched directed acyclic graphs which comes handy in their completeness proof. 
They show that this can be adapted for full-laziness as well [114, 39, 104, 46, 7 ]. 
Over the course of time, we now know that Launchbury's semantics (Figure 2.5) 
can also be neatly adapted for complete laziness [100]. 

It is noteworthy that, like their previous work [10], [7] has no proper treatment 
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Figure 2.8: A NEED of [62] 

for recursive values. However, they report that they are in the process of finding 
one on the basis of [8, 9]. 

The third work of this group is [ :] in which they present yet another variation 
of their original work [10]. See A NEED in Figure 2.8. A NEED enjoys pretty much every 
major interesting property of X nee d [7]. For example, Maraist et al. prove its 
confluence, and equivalence with Plotkin's call-by-name [85]. They also provide 
a computable deterministic strategy for standard evaluation. Whilst the proofs 
are mostly provided, some details are still left to [61]. Finally, it is noteworthy 
that this work also fails to provide a proper treatment for recursion. 

Bastonero, Pravato, and Ronchi della Rocca (1997) 

[15] aims to model lazy evaluation. However, like the work of Abramsky and 
Ong [2, 73, 3], it fails to capture sharing. Therefore, from the point of view of 
7], Bastonero et al. provide another model for Plotkin's call-by-name A-calculus 
which also corresponds to his respective evaluator [85]. Their model is based on 
an extension of coherent spaces [54] called pointed coherent spaces [12]. 

Bastonero et al. are interested in models which are adequate with respect to 
the operational semantics shown in Figure 2.1. Therefore, they acknowledge that 
the models offered by Absramsky and Ong [2, 73, 3] as well as the one by Longo 
[59] — referred to as M. ao an d M-l, respectively — do fulfill this. On the other 
hand, Bastonero et al. report that there are A-theories which can be modelled in 
Scott's domains but not in coherent spaces [ ], and vice- versa [ ]. Thus, they 
focus on models based on coherent spaces. 

With this in mind, they define the class of lazy regular models which enables 
them to present two models: .Mi and M.2- These are built over the coherent 
space which is the minimal solution to the equation V m n&(X> =^ s T>). (Here, 
II is the coherent space with just one atom, & is the additive Cartesian product, 
and V =^. s V is the coherent space representing the stable functions from V to T>.) 
Bastonero et al. prove that M.\ has the same theory as M. L and conjecture that 
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M.2 has the same theory as M.ao- But both M.\ and M.i have a local structure 
which is different from those of M.i and M.ao- Interestingly enough, neither of 
M.\ and At 2 is fully abstract. 

Moran and Sands (1999) 

The operational semantics which [ ] presents is based on Sestoft's "mark 1" 
abstract machine for laziness, which is essentially a Kri vine-machine [ ] with 
heap updating. Although Moran and Sands' small-step operational semantics 
(Figure 2.9) is inspired by that of Ariola et al. [10, 7, 62], their system has 
no problem with recursion. They also argue that the observational equiva- 
lence/approximation of Ariola et al. — which only observes termination — 
cannot distinguish between call-by-need and call-by-name. The key feature of 
their work, on the contrary, is its notion for a unit of cost: the abstract machine 
reduction step. It is based on this that Moran and Sands come up with their 
notion of improvement: 

M is improved by N if all program contexts C, when <L[M] terminates 
then C[iV] terminates as cheaply. 

This gives rise to a variety of technical results including an operational theory 
which retains the computational distinction between call-by-name and call-by- 
need. 

Moran and Sands prove a context lemma for their semantics. Using their tick 
algebra, they show that their improvement relation has an inequational theory 
which validates the reduction rules of call-by-need calculi by Ariola et al [10, 
7, 62]. They, next, prove a syntactic continuity property which characterises 
improvement of a recursive function in terms of its finite unwindings. 4 Finally, 
Moran and Sands present two proof techniques: they adapt the improvement 
theorem and improvement induction as formerly studied for call-by-name in [93] 
and [94], respectively. 5 

Some notation is necessary for Figure 2.9: each configuration is of the form 
(r, M, S). T is a heap of bindings, M is an expression, and S is a stack of 
variable names, case alternatives, or update markers — denoted by j^x — for 
some variable x. a : S is a pushed on top of S. x^dom(r, S) means x is neither 
in the domain of T, nor in the domain of S. 

Ghaly et al. (2007) 

28] starts with a comparison between the operational semantics of Purushothaman 
and Seaman [89, 95, 96] and that of Launchbury [55]. They then offer yet an- 
other operational semantics in which A-abstractions are not considered to be 
values. However, they still claim that their operational semantics describes lazy 



4 This forms the basis of fixed-point induction style proofs. 

5 These techniques verify the correctness and safety of recursion-based program transforma- 
tions which proceed by local improvements. [69] 
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(T{x = M},x,S) -> (T,M,#x : S) (Lookup) 

(T,V,#x: S) -> (T{x = V},V,S) (Update) 

(T,M x,S) ^ (T,M,x: S) (Unwind) 

(r, Ax.M, y : 5} -> (V, M[y/x], S) (Subst) 

(r, case M of ate, 5) -> (T, M, alts : S) (Case) 

(T, Cj y, { Ci Xi -> iVj : 5) -> (r, Njfi/x&S) (Branch) 

(T,\et{x = M}\nN,S)^(T{x = M},N,S) xidom(T,S) (Letrec) 

Figure 2.9: The Abstract Machine of [69] 

evaluation. They claim that their operational semantics is based on that of Pu- 
rushothaman and Seaman with some minor differences. The relationship between 
their system and any other semantics for lazy evaluation is not clear though. 

Discussion Although very powerful, the framework developed by Abramsky 
and Ong [2, 73, 3], as well as that of Bastonero et al. [ ] do not form a good basis 
for our purpose because they do not model sharing. The work of Purushothaman 
and Seaman also do not form a good basis for our purpose either because they 
miss sharing for recursive functions. Moreover, the fact that they work on ordered 
bindings causes several difficulties which do not exist in Launchbury's work [55]. 
Ariola et al. [10, 7, 62] also fail to deal appropriately with recursion. On the other 
hand, Moran and Sand's interesting framework [( )] for measuring the amount of 
sharing in call-by-need is more than what we need because we are not to provide 
similar measures for observational-equivalence in presence of selective-strictness. 
Finally, we do not choose to base our work on that of Ghaly at al. [ ] because 
their work is not well-presented enough. 

The only work which remains then is that of Launchbury [ 5]. It turn out 
that his framework is even suitable for extending this thesis to parallelism. Our 
evidence for this is that the operational semantics of parallel languages like GpH 
[103] are already based on Launchbury's semantics. Furthermore, in this thesis, 
we provide new means for examining the behaviour of programming languages 
which drastically depend on heaps. (See Chapters 7 onwards.) Launchbury's 
notion of heaps fits this application perfectly. 

2.2 History of seq 

Before considering the literature on selective strictness, we should give a brief 
account of the semantics of seq of Haskell. 

As it turns out, the meaning of seq has undergone a change over time. His- 
torically, seq was first given the following (asymmetric) semantics ([77, Section 
6.2]) 

seq _L b = _L 

seq a b = b if a ^ _L . 
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From Haskell 1.3, however, whilst the same term seq was preserved, it was 
employed with a slightly and subtly different semantics which is symmetric in its 
strictness specifications 

seq _L a = _L 

seq a _L = _L 

seq a b = b. 

The old semantics of seq then tended to be given to pseq. 

Interestingly, some Haskell variations such as GpH or Eden were designed 
before this change in semantics. Therefore, their seq is not totally compatible 
with today's definition of seq in Haskell. In other words, Haskell is not 
backward compatible on seq. This backward incompatibility (since Haskell 
1.3) has caused a great confusion. It becomes especially confusing when one 
considers the research which predates Haskell 1.3. For example [J 1] correctly 
target the seq of their time, but, rather address today's pseq. 

This confusion is still present in the Haskell community. One campaign 
78, 63] believe that the denotational semantics of seq and pseq is the same whilst 
the only constraint on seq is that it is strict in both its argument (as chosen by 
Haskell 1.3+). They argue that this is the reason why the operational semantics 
of seq is not portable across Haskell implementations. This campaign believes 
in an implementation of pseq such as that of GHC [29] 

pseq x y = x seq lazy y 

where lazy x = x but lazy has a special meaning to the strictness analyser in GHC: 
there is no demand inferred on x in lazy x. The other campaign of the Haskell 
community [64, 115] believe that pseq is to force evaluation of its first argument 
to be done before starting the evaluation of its second argument. Consequently, 
in a heap where evaluating x implies evaluation of y, trying x pseq y will result 
in a blackhole whilst x seq y might reduce. So, the subtle difference between seq 
and pseq in words turns out to be considerable in the operational behaviour. In 
fact, pseq demands a lock mechanism such as the i— > bindings used in [11]. (See 
Figure 2.20 of this thesis for more.) 

In this thesis, we model selective strictness for the original seq (that of Haskell 
1.3- which was asymmetric) or today's pseq. We make this decision because we 
had originally chosen to prove observational equivalence between GpH programs. 
Interestingly enough, the related work on selective strictness, as well as the work 
of Hall et al. [36, 11] all consider the same seq. We review the literature on 
selective strictness in Section 2.3. 

2.3 Selective Strictness 

In this section, we review the work of three different groups of scholars who have 
studied the impacts of selective strictness from different points of view. The 
reviews are ordered chronologically. We end our review over the work of each 
group with a discussion on the relation between the respective studies and that 
of ours. 
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el = seq ((\ (Just x) y -> x) Nothing) 3 

e2 = seq ((\ (Just x) -> (\ y -> x)) Nothing) 3 

e3 = (\ *(x, Just y) -> x) (0, Nothing) 

e4 = case 1 of 

s | x==z -> (case 1 of w | False -> 33) 

where z - 1 
y -> 101 
e5 = case 1 of 

x I x==z -> (case 1 of w I True -> 33) 

where z = 2 
y -> 101 
e6 = let fftC 0=1 

fax n - n i (fac (n-1)) 
in fac 3 

Semantics^ mE el rhoO Uugs> el 

3 3 

Semantic s> mE e2 rhoO fiugs> e2 

Program error: {undefined} Program error: {e2_v2560 Maybe .Nothing} 

S.emantics> iuEl e3 rhoO [iu.gs> e3 

Program error: {undefined} Program error: {e3_.v2&&8 CNum..fromInt instHum_v35 B,,,,)} 

Semantics^ mE e4 rnoO fiugs> e4 

Program- error: {undefined} Program error: {e4_v2&62 [tJure.. freiElnt instHum_v3& 1)} 

Semantics* mE c& rhoO Hugs* e& 

101 101 

Semantics* mE g6 rhoO Eiugs> g6 

Figure 2.10: A worked out example of the unit tests provided in [37 

Harrison et al. (2002 and 2005) 

Harrison, Sheard, and Hook [-T ] start with noting that strictness enforcing mech- 
anisms in Haskell are: nested patterns, case expressions and let declarations, 
guards and where clauses on equations, strict constructor functions, the newtype 
datatype definition facility, and the seq operator. They give all these mechanisms 
a generic name: means for fine control of demand — all of which they study in 
[37]. To the best of our knowledge, this work is the first to pinpoint the neces- 
sity of special care in reasoning about explicit strictness in presence of mixed 
(strict/lazy) semantics. Although their work is based on a denotational seman- 
tics, they explain that the combinatorial interaction between all these strictness 
enforcing means makes a fully denotational approach non-trivial. Therefore, in- 
stead, they take what they call a calculational approach. That is, they explain 
every detail for how to implement a Haskell interpreter in Haskell itself; The 
interpreter is meant to implement the details of their denotational semantics. 
They do not prove any properties of their calculationally-specified denotational 
semantics. Nor do they prove correctness of their implementation, rather they 
report it to have undergone a considerable number of unit tests, some of which 
they also present in their paper. 

Figure 2.10 shows the head-to-head comparison of the Haskell programs el 
to e6 between the famous Hugs interpreter [50] for Haskell and the computationally- 
specified denotational semantics of [37]. rhoO is an environment and mE is the 
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function which performs evaluation of expressions in their system. 

Harrison and Kieburtz [38] then choose to adhere to the P-logic [52] of the 
Programatica project [43]. They explain why P-logic can capture selective strict- 
ness and therefore be an appropriate modal logic for verifying the behaviour 
of lazy programs in presence of selective strictness. This time, they provide 
a self-contained description of a typed, denotational semantics for the follow- 
ing Haskell fragment: abstraction, application and case expressions (without 
guards) 6 . Their denotational semantics is based on an extension to the type 
frames semantics of the simply-typed A-calculus [34, 6 '] and is closely related to 
their previous interpreter [ ]. They prove soundness of their P-logic inference 
rules with respect to this denotational semantics. 

As a summary, we would like to remind that, as also explained in Section 2.1, 
the models based denotational semantics do not form a suitable basis for our 
purposes. This of course includes the work of Harrison et al. [37, 38] which are 
fully denotational. 

van Eekelen and de Mol (2002, 2004, and 2007) 

In the hrst work of their thread [ 15], van Eekelen and de Mol outline the de- 
velopment of the Clean programming language [84] which is coupled with the 
Sparkle theorem prover [ ]. The purpose of this mutual development is to 
state and prove properties of the program on-the-fly. They start by motivating 
the reasons why programmers use selective strictness, and the available selec- 
tive strictness constructs. They define three sorts of strictness: mathematical, 
operational, and notational: 

• A function / is mathematically strict in an argument x if the 
result of f applied to x is undefined if x is undefined. 

• A function / is operationally strict in an argument x if the ar- 
gument is always reduced to weak head normal form before the 
function application is evaluated and the result of the function 
is undefined if the argument is undefined. 

• A function / is notationally strict in an argument x if the argu- 
ment is somehow explicitly annotated as such. 

Finally, using a variety of examples, they show how the use of some auxiliary 
functions can make formal reasoning about programs easier. They also show how 
their system can distinguish between "O" and "A.fi" even within their language 7 . 
In [106] van Eekelen and de Mol start by noting reasons strictness is practiced 
and the means for this in Haskell and Clean. It is this practice that motivates 
their study of the semantics of selective strictness. They first give a graph-based 
explanation on Launchbury's semantics [ 5]. Then they define strictness to be 
a property of a function. This does not seem to be the case for some other 
programming languages which support selective strictness such as GpH[103]. 



6 This only includes the strictness enforcing machinery for pattern matching which, as Har- 
rison and Kieburtz [. I] also demonstrate, is in close interaction with datatypes. 
r Here, fi represents a denotationally _L computation. 
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(r, x l i-> e x ) : x x J| 6 : z x © : e J| A : z 

(StrictLet) 

T : let! x\ = e\ in e JJ. A : z 

Figure 2.11: van Eekelen and de Mol's Rule for let! 

1. V/ jfl V xs [map(J o g)xs = map f (map g xs)] 

2. V xs [finite(xs) =5- reverse(reverse xs) = xs] 

Figure 2.12: Examples from [ ] semantics of which changes by addition of 
strictness 

The most significant point about [ ] is that it correctly extends Launch- 
bury's system for the case of explicit strictness, van Eekelen and de Mol's deriva- 
tion rule for let! is depicted in Figure 2.11, where z represents a value. They also 
prove all the theorems stated in their previous work. Their most notable theorems 
here includes correctness and computational adequacy of their proposed opera- 
tional semantics. Finally, they prove the following "folklore" pieces of knowledge: 

• expressions that are bottom lazily, will also be bottom when we 
make something strict; 

• when strictness is added to an expression that is non-bottom 
lazily, either the result stays the same or it becomes bottom; 

• expressions that are non-bottom using strictness will also be non- 
bottom lazily with the same result. 

The hnal work of van Eekelen and de Mol in this thread is [ ] which is an 
updated version of the previous ones: It gives some short but helpful examples 
on when and why strictness annotation changes the semantics of program, and 
updates their literature review. Figure 2.12 shows two of such examples where 
finite(.) is a predicate which examines finiteness of its (list) argument. 

As explained in Section 2.1, Launchbury's work offers the most complete plat- 
form for reasoning about the behaviour of lazy evaluation which happens to be 
the easiest for our purposes as well. The work of van Eekelen and de Mol cor- 
rectly extend Launchbury's system to selective strictness. Therefore, although 
van Eekelen and de Mol never prove any identities using their system, we also 
choose to base our work on their system. 

Johann, Voigtlander, and Seidel (2004, 2006, 2007, and 
2009) 

Johann and Voigtlander [47] focus on the effect of seq on free theorems. As 
first popularised by Wadler [113], free theorems are said to be free because they 
can be derived solely from the type of a function — virtually for free, i.e., with 
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Theorem 2.1. For every junction 

filter :: \/a. (a — > Bool) — > [a] — > [a] 
and appropriately typed p, h, and I, the following hold: 

• If h is strict, then: 

filter p (map h I) jZ map /i (filter (p o /i) /) 

• ifp^A. and h is strict and total, then: 

filter p (map h I) ZJ map /i (filter (p o /i) /) 

• ifpj^A. and h is strict and total, then: 

filter p (map h I) = map h (filter (p oh) I). 

Figure 2.13: A Free Theorem that Holds in Presence of seq. 

no knowledge of the actual definition of the function. Free theorems are used 
to derive program equivalences involving parametric polymorphic functions in 
programming languages based on Girard-Reynolds [31, 91] Lambda Calculus. 
Johann and Voigtlander show that although free theorems hold unconditionally 
for polymorphic functions in Girard-Reynolds calculus, they might fail in presence 
of strict primitives such as seq. They explain why seq causes this failure, and, 
develop a technique for recovering the free theorems failing in such situations. 
They also apply their technique to prove some famous free theorems as shown 
in Figure 2.13. As they explain, this technique can be used for recovering free 
theorems in a language as large as a subset of Haskell [ 7 ] that corresponds to 
Girard-Reynolds-style calculus with hxpoints, plus algebraic data types, and seq. 
In their second related work [ ], Johann and Voigtlander focus on an im- 
portant application of free theorems, namely, short cut fusion 8 [30] and more 
generally the related issues to program transformation. They use the results de- 
veloped in their previous work [47] to study the problematic aspects caused by 
seq for the free theorems associated with the foldr /build rule used in short cut 
fusion, as well as its dual — the destroy /unfoldr rule [101]. In both cases they 
demonstrate the usefulness of their machinery for recovering from the side effects. 
See Figure 2.14 for example. 9 The results of this latter paper are valid for a 
subset of Haskell as large as Pitts' PolyFix [ ] with seq. It is also noteworthy 



8 Short cut fusion is an optimisation technique, gist of which being the use of some local 
replacement rules to avoid unnecessary construction/destruction of intermediate lists. 

9 Interestingly enough, build in Figure 2.14 has a rank-2 type [57], i.e., it takes a polymorphic 
function as an argument. Haskell's current type-system does not support this. However, most 
Haskell implementations do support it as an extension. See [112] for more. 
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Theorem 2.2. For every closed type r, every function 

g::V(3.(r^ /? -► (3)^(3^(3, 
and appropriately typed c and n, the following hold: 

• foldr c n (build g) Zj g c n 

• if c J L 7^ _L and n ^ _L, then 

foldr c n (build g) = g c n. 

Figure 2.14: Recovered Free Theorem for foldr in [48] 
Let r G Typ and R, R' G Term(T). Write R •w R' for the following pairs: 



R 


fl' 


if 




(Ax :: t'.N) A 


iV[^/x] 


x :: t' h N :: t 




(ka.N) T , 


iV[r'/a] 


ahJV:: r" 




case nil T / of {nil =>• M;h:t^ M'} 


M 


/i :: r',t :: r'-list\- M' : 


: t 


case H : T of {nil => M;h:t^ M'} 


M'[H/h,T/t] 


/i :: r',t :: r'-list\- M' : 


: r 


fix(F) 


F fix(F) 






seq (V, M) 


M 


1/ G Value, 





where x, h, and £ are term variables, a is a type variable, r' G Typ, A,H G 
Term(r'), M G Term(r), T G Term (t' -list), F G Term(r — > t), and further 
types and terms that occur in the table are subject to the restrictions recorded 
on the right. 

Figure 2.15: Operational Semantics of PolySeq 

that [ ] covers all the proofs omitted in [17]. 

In their next work [110], Voigtalander and Johann explain that a denotational 
model would depend on correctness of some open conjectures. Therefore, they 
choose to work in an operational setting. They offer PolySeq for this purpose 
and for reasoning equationally. Voigtalander and Johann report that the shift to 
reasoning equationally is only a part of a longer plan for adapting operational 
settings to analysing free theorems in presence of selective strictness. 

PolySeq's syntax is that of Pitts' PolyPCF [(SO] plus seq. However, unlike 
PolyPCF, PolySeq has a small-step operational semantics which comes in Fig- 
ure 2.15. As Voigtalander and Johann explain, as long as free theorems are con- 
cerned, there is no difference between call-by-need and call-by-name. Therefore, 
to avoid complications, they design PolySeq so that it does not capture sharing. 
It is not a surprise then that they have a result like this [ , Corollary 7.5]: 
Proposition. For every A, B G Term, if A JJ., then seq (A, B) = b s B. 
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[110] shows how to prove the correctness, with respect to observational equiva- 
lence, of parametricity-based transformations on PolySeq programs. Voigtalander 
and Johann's example for this is again short-cut fusion. As they explain, their 
results work for implementations of Haskell like GHC [ ]. The special point 
about GHC is that it combines Haskell's selective strictness with impredicative 
polymorphism [i±~]- 10 

The forth related work of Voigtalander and Johann [ ] goes back to con- 
sidering observational approximation but still in an operational setting. [Ill] is 
mainly about constructing a parametric model of observational approximation for 
PolySeq. They show the correctness of their technique, and, in order to demon- 
strate it, they again examine the short cut fusion. [Ill] omits the proofs which 
are already done in [110]. 

The last work of this thread is [ ] where Seidel and Voigtlander combine the 
ideas of [47] and [56] to offer a new type-system. Their new type-system helps 
them to remove some unnecessary side-conditions for free theorems in presence 
of selective strictness. The solution provided in [ ] is two-fold: they first re- 
fine PolySeq's type-system to come up with PolySeq*. Seidel and Voigtlander 
prove that PolySeq* and PolySeq are equivalently expressive. However, PolySeq* 
provides a stronger parametricity theorem with a better algorithmic behaviour. 
Next, they provide PolySeq c which they show to be equivalently expressive to 
PolySeq* but has yet more refined type-system. The special use of PolySeq c is 
that it enables deriving minimal — in the sense of minimal logical relations — 
free theorems about the programs equally typable in PolySeq. This is through 
solving and minimalising the set of additional constraints that PolySeq c gener- 
ates during the typing of expressions. Some long proofs of [ ] are presented in 
[98]. 

We note that Johann, Voigtlander, and Seidel's study of free theorems in 
presence of selective strictness is broader than our study in this thesis. Addition- 
ally, their research addresses more sizeable languages than ours. However, their 
machinery targets inequational theorems, which technically means that they do 
observational approximation rather than observational equivalence. Moreover, 
their approach substantially relies on certain preconditions to hold so that they 
can recover free theorems. It turns out that, to recover equational versions, these 
preconditions need to be even stronger. 

As an example, Figure 2.16 depicts the free theorem derivable for seq in Poly- 
Seq, as provided by Johann and Voigtlander 's online demo 11 . (Note that, unlike 
us that use seq in an infix form, Johann and Voigtlander use it in a prefix form.) 
Taking f{x) = x and g{x) = x seq z, if it would have been the case that g 
was total, associativity of seq would have followed in PolySeq as well. How- 
ever, g is obviously not total. 12 With the aids of the techniques developed in 

], Seidel and Voigtlander remove the need for g to be total. Yet, as provided 



In short, predicative polymorphism means that one can only instantiate polymorphic func- 
tions with monomorphic types. Impredicative polymorphism is removing this restriction in 
instantiation of polymorphic functions. 

n http: //linux. tcs . inf .tu-dresden.de/~voigt/ft/ 

12 Call a relation %^ total when for every pair x, y such that x1{jy, x =£ _L implies y =£ _L. 
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The theorem generated for functions of the type 

seq :: \/ab.a —> b — > b 

in the sublanguage of Haskell with general recursion and selective strictness, equa- 
tional style, is: 

Vti,t 2 G TYPES,/? G REL(ti, t 2 ), R strict, continuous, and bottom-reflecting. 
Vt 3 ,t 4 G TYPES, S G REL(t 3 ,t^), S strict, continuous, and bottom-reflecting. 
V(x,y)£R. 

\/(z, v) G /!).(seq x z, seq y v) G S 

Reducing all permissible relation variables to functions yields: 

Vti, t 2 G TYPES, / :: t x -> t 2 , /strict and total. 
Vt 3 ,t 4 G TYPES, g :: t 3 — > t±,g strict and total. 
\/x :: t x . My :: £ 3 . g (seq x y) = seq (/ x) (g y) 

Figure 2.16: A Free Theorem about seq 

by their online demo 13 , for / :: t\ — > t 2 and g :: £3 — > t 4 they still demand 
seq tl ig y^ _L -^ seq t2 i4 ^ _L so that they can derive associativity of seq. 14 Similar 
constraints on types seem natural for free theorems because they are essentially 
derived from the types. On the contrary, we put no constraints on the types and 
our system works independent of typing. 

As a hnal comment, we would like to remind that the majority of works by 
this group take denotational approaches which despite their undeniable merit, 
as explained in Section 2.1, do not form a good basis for the properties we are 
after. On the other hand, the only two works of theirs which take an operational 
approach ([110] and [111]) are also not suitable for our purposes because they are 
designed not to capture sharing. Technically, this means that they study selective 
strictness for call-by-name as opposed to call-by-need. 

2.4 Other Related Work 

This section reviews further sets of related work in a chronological order. In 
outline, [J LG] provides an alternative mechanism for studying sharing; Hall et 
al. [36, 11] extend Launchbury's semantics to generalise the study of selective 
strictness to that of parallel coordination; Hidalgo-Herrero [ ] offers a denota- 
tional semantics in which they can prove similar identities to the ones we prove 
in this thesis; [18] studies reasoning about partial languages in the presence of 
selective strictness — although selective strictness by itself is not a concern of 
their research; Formal reasoning about functional programming is investigated in 



13 http: //linux. tcs. inf .tu-dresden.de/~seideld/cgi-bin/polyseq. cgi 
14 In this notation, f t denotes a polymorphic function / instantiated for type t. 
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[20]; Finally, [92] extends Launchbury's operational semantics [55] for the study 
of distributed lazy evaluation. 

Yoshida (1993) 

[116] presents a weak A-calculus that formalises functional execution with shared 
environments. This is called Xf which Yoshida proves to be tractable. She 
employs explicit substitutions which is reminiscent of the common practice of 
using let-binding in functional programming. (Xf is a calculus in that it comes 
with no reduction strategy.) She established the Church-Rosser property and 
normalisability of Xf. Based on those, Yoshida then shows how the left-most 
reduction strategy of her calculus is optimal in the weak execution scheme. Whilst 
proof sketches come in [ ], the complete proofs come in [117]. 

It is a fact that, for an arbitrary expression, the first occasion in which a vari- 
able is evaluated is not known until the run-time. Therefore, as also explained 
in [95, 96, 62], modelling lazy evaluation is not possible via simply providing a 
reduction strategy for the work like ["] with semantics based on explicit substi- 
tution. This is because such works simply copy the contents of substitutions for 
the case of application: 

App (ab)[s]^(a[s])(b[s]). 

Yoshida is careful to design a calculus which works perfectly for its purpose, 
i.e., optimal reduction. For this purpose, in certain situations, variables are 
fine not to be updated with the respective substitutions. However, this does 
not completely fit lazy evaluation where each variable is needed to be evaluated 
at most once. Besides, due to her notion of observation, reduction in Xf is 
not equivalent to reduction to weak-head normal form (whnf ) as is used in lazy 
programming languages. 



Hall et al. (1998 and 2000) 



is the first work in this thread, which starts by noting that parallel programs, 
in addition to computation, must also specify coordination. That is, more than 
just what to compute (which is of concern to any program), parallel programs are 
concerned about how to arrange the computation too. The authors use the two 
primitives seq and par of GpH[ ] as the means for this purpose, with definitions: 

e par e 1 = e\ 

_L if e = _L 



e seq e 1 = , 

" e\ otherwise 

(As discussed in Section 2.2, this asymmetric definition for seq agrees with the 
respective definition of Haskell standard of the time [36] was written.) 

However, they express their wish in their model to be general enough for in- 
vestigation on the validity of identities such as the ones in Figure 2.22. 15 For 



15 Hall et al. however do not suggest using the [.] denotation for establishing the identities. 
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w, x,y, z G Var 

e G Exp ::= Xx.e 
V x 
x 

let {x i =e i } T l =1 ine 
x par e 
x seq e 
A,6 G Heap = Var ^ Exp 
v G Val ::= Xx.e 



Figure 2.17: The Syntax of Hall et al. [36] 

that, their model is of a slightly higher level of abstraction than the mere "pro- 
gramming language" GpH. That is, as high as speculative evaluation only. (See 
our review on their next work [ ] which is closer to the actual GpH programming 
language.) Their semantics is an extension over that of Launchbury [ ]. One 
key difference between theirs and the latter is that their semantics is a small-step 
one. They argue that, because of the coordination requirements between thread 
creation and destruction, consideration of behaviour at the level of single reduc- 
tion is needed. Therefore, their choice for small-step semantics fits naturally to 
the needs of parallelism. 

Figures 2.17 and 2.18 show their syntax and operational semantics, respec- 
tively. 16 For explanations on their rules, consult [36] where they also offer sample 
reductions. The similarities between their work and that of Launchbury does not 
stop here in that they prove: Firstly, that in the case of single resource (when 
|T| = 1 in rule (Product) of Figure 2.18) their system simulates the latter one. 
And, secondly, that in the presence of more than one resource, the computations 
are fully serialisable. Consequently, even in the presence of parallelism, their 
systems correctly models sharing. 

The next related work of this group is [11]. This work is much closer to 
the real operational semantics of GpH [58] in that, for instance, the bindings 
are labeled to model GpH threads — as (A)ctive, (i?)unnable, (J)nactive, and 
(5)locked — and, the semantics is parameterised with the number of processors 
(N). Like [36], they present their semantics in a two-level transition system; the 
lower level pursues single-thread transitions, where the upper level manages their 
combination. Although their semantics is still small-step, it is at a higher level 
than most abstract machines. For example, it has no need for stacks or blocking 
queues. 

Figure 2.19 shows the syntax of this work, and Figure 2.20 depicts part of 
their operational semantics. Note the following notational conventions in these 
Figures: (1) x \— > e stands for x i— > (e, a), (2) e x ::= x \ x y \ x seq e'. The 
most related part of Figure 2.20 is its (seq) and (seq-elim) rules. There are two 
differences between this couple of rules and our (seq) rule. (See Definition 3.4 for 



16 Note that despite Launchbury who employs z for values (whnf expressions), they employ v 
for this purpose. 
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Sequential Rules 

A : (y i-> let {x i =e i }% =iX in e) 

-> (A, Si i-> ei, . . . , ar n i-> e„) : (y i-> e) (Let) 

(A,x i— > Aw.e) : (2 h d j/) -> (A, a: 1— > Aio.e) : (2; 1— >• e[y/w]) (Application) 

(A, x 1— > u) : (y 1— > x) — > (A, s 1— > u) : (y 1— > w) (Variable) 

(A : 1 h u) : (z hi seq e) — > (A, x 1— ► u) : {z 1— > e) (Sequence) 

A : (z h x par e') — ► A : (2 t— > e') (Parallell) 

(A, 1 h e) : (z h 1 par e') — > A : (x \—> e, z \—> e') (Parallel2) 



Parallel Rules 

A : Ti -> A : r- , 1 < z < n red n red + m a < max 



A:r-r d u(jAAA„:A u(jTi 



(Product) 



Figure 2.18: The Operation Semantics of Hall et al. [ 



x,y,z G Var 

n G Number 

e G Exp 

e ::= n | x \ e x \ Xx.e | let {x i =e i \ 1 l =l in e 
I ej seq e 2 | x par e 

H,K G Heap = Var ->■ (Exp, State) 

a,/3 G State 

a ::= Inactive \ Runnable \ Active \ Blocked 

v ::= n I Ax.e 



Figure 2.19: The Syntax of [11] 
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H : z i— ► let {xi=&i\ 1 i = \ in e — > ({xj i— > ej}" =1 , z^-se) (Zei) 

(77, x i— > v) : 2 i— > x — > (z i— ► £>) (war) 

(iJ, x i— > e) : 2 i— ► x — > (iHe,zi->i) (blocki) 

RAB A B 

(H,x i— > e) : 2 i— > x — > (2 1— > x) (block?) 

H : z \—> (Ay.e) x — > (2 1— > e[x/y\) (subst) 



H : z 1— » e — > (if, 2 1— > e') 



(app) 



ii : 2 1— > e x — > (if, z h e' 3;) 

iJ : 2 1— > w seq e — > (2 1— > e) (seq-elim) 



A 



H : z i— >• ei — > (if, z h e') 
// : 2 1— > ei seq e 2 — > (if, 2 ^ e^ seq e 2 ) 



(seg) 



(iZ, x 1— > ei) : z 1— ► x par e 2 — >• (2 1— ► e 2 ) (par-elim) 

Figure 2.20: Part of the Semantics of [11] 

our (seq) rule.) Firstly, this is a small-step treatment of Haskell's seq operator 
whereas we deal with the same operator in a big-step way. Secondly, for the sake 
of modelling parallelism as well as selective strictness, their bindings are labeled 
(with 1— > and 1— > in this case). 

In this work of theirs, they extend Launchbury's denotational semantics for the 
case of par and seq. With respect to this, they prove Soundness, Computational 
Adequacy, and Determinacy of their system. Most interesting out of that list, is 
perhaps Determinacy which states that "the same result is always obtained irre- 
spective of the number of processors, and irrespective of which runnable threads 
are chosen for activation during the computation". Afterwards, they show the 
richness of their semantics in that, not only it can model GpH, but — with 
vary small manipulations — it can also reflect the behaviour of other evalua- 
tion strategies such as Sequential Evaluation, Fully Speculative Evaluation [33], 
Non-deterministic Choice [65, 68], and Controlled Speculative Evaluation. They 
finalise this work by formally defining Work, Runtime, and Average Parallelism 
(all of which they had only informally defined in their previous work [ ]), in 
addition to Maximum Parallelism. 

In summary, the work of Hall et al. [36, 11] suggest an interesting framework 
for extending our study of observational equivalence to include coordination as 
well as selective strictness. The second and third identities in Figure 2.22 are 
examples of observational equivalence in such a framework where par is a coordi- 
nation mechanism. This is subject to future work. 
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Semantic Domains 







Cont 


K 


e 


ECont 


P 


e 


Env 


V 


e 


Val 


e 


e 


Eval 


a 


e 


Abs 


V 


e 


Clo 



Env — > Env continuations 

EVal — > Cont expression continuations 

Ide — > (Val + {undefined}) environments 

Eval + Clo + {not_ready} values 

Abs expressed values 

Ide — > Clo abstract values 

ECont — > Cont closures 



Evaluation Function 

£[#]« = force x k 
£{\x.e\K= K{\x.£\el) 
£\x x x 2 }k = SIxiJk' 

where k 1 = Xe.Xp.e x 2 k p 



£\x\ seq x 2 ]k = £[xi]k' 
where k' = Ae.Ap.5[x2] K p 

£\xi par x 2 \k = 5[x 2 ]k' 

k' = Xe.Xp.K, e p 
p par = par xi p 



where 



par 



where 



5 [let {xi = ei}f =1 in x\k = Xp.£\x\n p' 
{yi\l=i = newlde n p 
p' = P © {y* >-> £le i [y j /x J ]] =1 j | 1 < i < n} 

Figure 2.21: Denotational Semantics of Hidalgo-Herrero [40] 



Hidalgo-Herrero (2004) 

Using the fully denotational approach shown in Figure 2.21, Hidalgo-Herrero [40] 
proves a number of denotational identities about programs containing seq and 
a parallel composition, par. The key identities are shown in Figure 2.22 where 
£ :: Exp — > ECont — > Cont. The way po is defined turns out to be more subtle 
than it might be expected. Although the model of Hidalgo-Herrero [40] does not 
capture sharing, it still considers more than our ~„ (Definition 4.1), i.e., simply 
the final value of an expression. However, unlike our ~ s (Definition 4.3), they do 
not observe the entire environment (or heap in Launchbury's semantics [55]). For 
example, for the first identity of Figure 2.22, x, y, and z are observed whilst other 
implicated changes to the environment are not considered. Therefore, currently, 
it is unclear what the correspondence is between their denotational identity and 
the operational equivalences in this thesis. This is future work. 



£\x seq (y seq z)} k p = £\{x seq y) seq z\ k p 

£{x par (x par y)] k p = £\x par y\ k p 

£\x seq (y par z)J k p = £\(x seq y) par (x seq z)\ k p 



For appropriate kq and Pq. 



Figure 2.22: Related Identities Proved in [40] 
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Danielsson, Hughes, Jansson, and Gibbons (2006) 

[18] aims to justify the common practice in assuming that equational laws in 
total languages are also correct in partial ones. For this purpose, they define two 
languages with the same syntax, but one total and the other partial. The key 
result is then to show that when two closed terms have the same semantics in the 
total language, they will have related semantics in the partial one. 

This is achieved through defining two denotational semantics for their lan- 
guages: a set-theoretic ([.]) and a model-theoretic (((.))). They then define their 
moral equality relation (~): 

• If a type a does not contain function spaces, then x ~ CT y iff x and y are 
equal total values. 

• For functions, / ~ g iff / and g map (total) related values to related values. 

Using this, finally, they show that for any two terms t\ and £2, two pairs of 
contexts pi, p[ and p2, p' 2 both satisfying a certain condition, ((£i))Pi = ((£2)^2 
implies [tijpi ~ [*2jp2- I n words, they prove that if two terms are equal in the 
set-theoretic world, then they are morally equal in the model-theoretic world. As 
an example of this, they demonstrate that 

([revMap o mapRev))[y 1— > n] = id =5- \{revMap o mapRev) xs\[y 1— > n] = [xs] 

for revMap = reverse o map(\x.x — y), mapRev = map(Xx.x + y) o reverse, and 
some natural number n. The commonality between [ ] and this thesis is that 
selective strictness is available in both languages studied in the former. However, 
there is no special focus on the role of selective strictness in their work. Despite 
that, the existence of seq in their syntax is perhaps because seq is one possible 
mechanism to produce partial definitions in Haskell. 

de Mol (2009) 

de Mol's PhD thesis [! ] is on formal reasoning about functional programming. 
More specifically, it is on a tool dedicated to this purpose: the Sparkle proof 
assistant, mutually developed with Clean. The reason why [ ] remains related 
to this thesis is that complete Sparkle proofs underpin how a program deals with 
_L as well as the circumstances under which a program yields _L. In other words, 
they have special means for dealing with definedness properties of programs. 

de Mol's thesis starts by giving a history of software bugs and the methods to 
cure/prevent them. Amongst those methods, de Mol defines formal reasoning to 
be "the mathematical process of building proofs", and explains why they chose 
this option. The remainder of his thesis comprises of nine papers de Mol has either 
been a major author of or has co-authored. Three of these are already studied 
in Section 2.3 because they were dedicated to selective strictness [105, 106, 108]. 
Here, we review the rest, each of which comes as a chapter in [20]. 

The first paper [ ] reports the development of CleanPrOVErSySTEM — a 
prototype for Sparkle — which works for a small subset of Clean. Example 
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VaVxs G a-list. reverse {reverse xs) = xs (2-1) 

VaVxs G a-list\/n G N. {take n xs) ++ {drop n xs) = xs (2-2) 

Vx, y, z G N. x (l/+z) = x y xx z (2.3) 

Vx G N. log T = x (2.4) 

Figure 2.23: Some Theorems Proved in CleanPrOVErSySTEM 

theorems proved in CleanPrOVErSySTEM are shown in Figure 2.23 but side 
conditions like n/1 are not considered for instance for (2.2). A more complete 
list of theorems proved in CleanPrOVErSySTEM can be found in [19] which con- 
tains all the 72 theorems in Bird's famous textbook [16]. CleanPrOVErSySTEM 
is still available online. 1 ' 

The second paper [ I] explains why the integration of Sparkle with Clean 
is preferred over indirection via common theorem provers such as PvS [ ], COQ 
[102], and Isabelle [75]. This is because Sparkle itself has a semantic based 
on lazy graph term-rewriting which allows reasoning to take place directly on 
Clean programs rather than their translations. Yet, Sparkle reasons on the 
translations of Clean programs into Core-Clean. Core-Clean is a subset 
of Clean which only supports application, sharing, and case distinction with 
a special support for strictness annotation. 80% of theorems proved in Bird's 
textbook [16] are reported in [22] to be successfully proved using Sparkle. Most 
importantly, the side condition n ^ _L is now added in for (2.1). 

The third paper [24] is a tutorial of Sparkle for (functional) programmers. It 
first explains the concept of formal reasoning in general. Several design decisions 
are particularly justifies for Sparkle then — most importantly the integration 
of Sparkle in the Clean development studio. Finally, a brief tutorial on how 
to use Sparkle in the basic and advanced level is given. Demonstrating the 
effects of laziness and strictness annotation on Sparkle is of special interest to 
this tutorial. 

de Mol starts his next chapter [ ] by exemplifying the reformulation needed 
in the statement of theorems such as (2.1) upon changes in strictness proper- 
ties. Next, it shows how Sparkle's support for explicit strictness facilitates such 
reformulations using appropriate modifications in proof rules. Two expressions 
are considered interchangeable in [ 7 ] when they either both compute the exact 
same value, or they do not reduce at all. Interestingly enough, our ~„ (Defini- 
tion 4.1) demands the same condition to hold between two expressions but for 
evaluation in every heap. 

In [25], a confluent term graph rewriting system is presented which is similar 
to Launchbury's operational semantics for lazy evaluation (Fig. 2.5). The rewrite 
system of [25], however, has a small-step semantics and therefore allows partial 
and inner reducts as well. It is also explained in [25] why their freedom of choice 
of redexes forms a suitable basis for formal reasoning. Finally, they show that a 



17 



http: //www. cs .ru.nl/~maartenm/CleanProverSystem/ 
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Definitions: 

Oi ~ 2 & (Voi G Oi3o 2 G 2 -[oi C o 2 ]) A (Vo 2 G 2 3oi G 0i.[o 2 Q oi]) 

■01 ~ ^ 2 <^> Output(<ipi) ~ Output(ip 2 ) 
Equal(e u e 2 ) oVxG V e Ve G £°P ened 

[0uipui(run e xl _> ei with -0) ~ Output(run e x ^ e2 with 0)] 

The following terms are semantically equal 

repeat 1 (2-5) 

let x = [l,l]::x in [l]::x (2.6) 

let x = [1, 1, 1, l]::x in cc (2.7) 

repeat 1 ++ [2, 3, 4] (2.8) 

repeat 1 ++(let x = x in x) (2-9) 

Figure 2.24: Semantical Equality in Chapter 8 of de Mol's Thesis 

special strategy for their system behaves correctly with respect to Launchbury's 
operational semantics. 

The contents of the next chapter in de Mol's thesis [ ] is a simplified ver- 
sion of Chapter 7 in [23] which focuses on a formalised semantics for expression 
equality and its properties. Suitability of the special rewrite system of [ ] is 
demonstrated this chapter of his thesis. The building blocks of de Mol's defini- 
tion of semantical equality between two expressions (Equal^(., .)) can be found in 
Figure 2.24. (For precise definitions consult [23].) Figure 2.24 also contains exam- 
ple de Mol's semantical equalities. Sparkle currently cannot prove the equalities 
in Figure 2.24 because of its lack of co-induction. This chapter of de Mol's the- 
sis proves reducibility of his semantical equality and reports that its referential 
transparency is also proved in [ ]. (In a nutshell, reducibility means that equal- 
ity is invariant under reduction and referential transparency means that, within 
a given scope, all occurrences of an expression denote the same value.) 

Formal reasoning about general type classes is the main concern in [109]. This 
is first shown not to be possible in Isabelle — the only proof assistant that the 
authors of [ ] knew that could support theorem proving about type classes 
by the time of their writing. Next, they show how they have developed a new 
inductive scheme which results in a strong and effective proof rule and tactic for 
assisting proofs about general type classes in Sparkle. 

In [4], a common formal model is first developed for a certain class of desktop 
and web applications. This is based on a point-free style of Arrow combinators 
[45] for the GEC [ , ] and iData [82, 83] GUI design toolkits. This common 
formal method is then used for formal reasoning about GUIs developed under 
both GEC and iData. 
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Figure 2.25: Distributed Model of Eden (Left), Process Creation for x#y (Right) 



Sanchez-Gil, Hidalgo-Herrero, and Ortega-Mallen (2009) 

Sanchez-Gil, Hidalgo-Herrero, and Ortega-Mallen [ ] update the former small- 
step operational semantics for distributed lazy evaluation [40] which corresponds 
to the programming language Eden[ ]. Their operational semantics is based on 
Launchbury's natural semantic for lazy evaluation [ ] and is very close to the 
one presented in [11] for GpH. Figure 2.25 (Left) demonstrated the computation 
model of Eden where HiS are heaps. 

The main difference between this model and that of GpH is that, although 
bindings are unique and shared inside each individual heap, duplication is allowed 
across heaps. As a matter of fact, in Eden, x#y is a means for the programmer to 
trigger parallel application. As shown in Figure 2.25 (Right), upon the evaluation 
of a binding z i— > x#y in a process P, a new process C is created with enough 
bindings to carry on with the evaluation of the application x y. The two processes 
P and C communicate using channels i and o. 

The operational semantics of [92] consists of local rules and global rules. Their 
local rules as well as the global rules that do not consider parallel application 
correspond closely to those of the operational semantics of GpH (Figure 2.20). 
Therefore, Figure 2.26 only depicts the rules for x#y. Note the following conven- 
tions in Figure 2.26: S represents the rest of the system (consisting of a number 
of processes); processes are of the form (p, H) where p is the process ID and H 
is its corresponding heap; finally, when dom(Hi) fl dom(H2) = 0, the notation 
H\ + H2 is defined to be H\ U H2, and is undefined otherwise. 

Sanchez-Gil, Hidalgo-Herrero, and Ortega-Mallen [92] prove that their oper- 
ational semantics is correct and computationally adequate with respect to the 
denotational semantics of Abramsky and Ong [72, 2, 3]. This is done via two 
intermediate semantics: a one-processor version of their operational semantics, 
and an extension over Launchbury's operational semantics [55]. They also prove 
the determinacy 18 of their operational semantics. 



18 Determinacy in this context means that the result of a computation is not dependent on 
the number of processors. 
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Syntax 

E ::= x | Xx.E \ x y \ x#y | let {xi i— > e,}" =1 in cc 

Parallel Application Rules 

(blocking p.c.) (S, (p, H + {6 &x#y})) ^ (S, (p, H + {6 A x#y})) 
(proc. creat.) if -id (a;, H + {6 \—> x#y}), q, z, i, o fresh, rj fresh renaming 
(S,<p,H + {0Ax#y}))Z 

R A An 

(S, (p,H + {9 i-> o, i i-> y}), (q, 7](nh(x, H)) + {o i-» rj(x) z, z i-» i})) 



(value comm.) if — id(W, iJ p ), 77 fresh renaming 



B 



(S, (p, H p + {ch^ W}), (c, H c + {6^ ch})) 



A 



(S, (p, H p ), (c, H c + 77(nh(W, H p )) + {6 A V (W)})) 
Auxiliary Functions 

,/ tt\ _ { t rue if # ^ e G nh(x, H) with (E = chV E = y#2) 



/aZse otherwise 
nh(£',i/) is the greatest set of bindings satisfying (i) and (ii) 

(i) iAEeif^{iH£}u nh(£, ii") C nh(x, #) 
(ii) U nh(E',H) Cnh(E,H) 

E'subexp(E) 

Figure 2.26: Global Rules of [92] Related to Parallel Application. 
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Summary In this chapter, we first consider the literature on call-by-need in 
Section 2.1. We justify our design decision in basing our work on that of Launch- 
bury [55]. Next, in Section 2.2, we take a look at the change the semantics of seq 
has experienced over the time. We clarify the seq semantics that we choose to 
study for selective strictness. In Section 2.3, we review the literature on selective 
strictness which is the main concern of this thesis. We design our operational 
semantics to be a variation of that of van Eekelen and de Mol [10< ] which ex- 
tends that of Launchbury [ ]. The details of our reasons for making this design 
decision is explored in Section 2.3 where we summarise the work of each group 
with a discussion on how they fit our purposes. Finally, Section 2.4 explores other 
related work. 
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Chapter 

Operational Semantics 



In this chapter, we first present an operational semantics in Section 3.1. Next, in 
Section 3.2, we explore a number of fundamental properties of this operational 
semantics. Proofs of Section 3.2 are all based on the operational semantics of Sec- 
tion 3.1. The results in Section 3.2 will later be useful in this thesis. In particular, 
Determinism (Theorem 3.9) is central to our entire system, and Lemmas 3.10 and 
3.11 will be useful in Chapters 6 and 7. 

3.1 Syntax and Operational Semantics 

We start by presenting some syntax (Definition 3.1). We then give a big step 
operational semantics based on Launchbury's [55], and closely related to [106] 
(Definition 3.4). 

Definition 3.1. Fix a countably infinite set of variable symbols. x,y,z,... 

and xi, x 2 , x 3 , . . . will range over variable symbols. 
Define expressions and values by 

e ::= x | Xx.e | ex | let {x i =e i }" =1 in e | e seq e 
v ::= Xx.e 

e, e 1 , ex, ... will range over expressions, v, v 1 , v\, ... will range over values. 

We quotient syntax up to a-equivalence of X- and \et-bound variables. For 
example, Xx.x = Xy.y. We write e[y/x] for the usual capture-avoiding substitution 
of x by y in e. For example, (Xy.x)[y/x\ = Xy'.y, if x, y, and y' denote distinct 
variable symbols. 

Definition 3.2. Define fv(e) the free variables of e by: 

fv(x) = {x} fv(Xx.e) = fv(e)\{x} fv(ex)=fv(e)Li{x} 

n 

fv(\et{xi=ei}? =1 me) = (fv(e) U |J.M e i)) \{xi,...,x n } 

i=l 

fv(e' seq e) = fv(e') U fv(e) 
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Note that we write (fv(e)u{J™ =1 fv(ei))\{xi, . . . ,x n } and not (fv(e)\{xi, . . . , x n })U 
Ui=if v ( e i)- Here, we are following [ ] and allow let to express recursion. See 
Remark 3.5 for further explanation. 

Definition 3.3. If f is a partial function write dom(f) for the domain of f . In 
symbols, dom(f) = {x | f{x) defined]. 

Call a partial function Y mapping variable symbols to expressions and such 
that dom(T) is finite, a heap. V, A, 0, and S will range over heaps. 

Suppose x ^ dom(T). Define (T,x \—> e) by: 

(r, x i— > e){x) = e 

(T,x^e)(y) = T(y) y G dom(T) 

(T,x i— > e)(y) undefined y G - dom(T) U {x} 

Also write 

(r, Xi i-> ei)\ =1 for (r, xi i-> ei), and 

(r, Xi ^ e*)? =1 for ((r, x { ^ ei)^, x n ^ e n ). 

Definition 3.4. Define an operational semantics Y : e JJ. A : v by: 

(]am) T-.ei^A-.v 

-n \ u -n \ \ lam ) (var x ) 

r : Xx.e $ T : Xx.e ^ x ^ e ) • x ^ ( A , x^v) : v 

T : e ^ : Xy.e' : e'[x/y] fy A : v 



T : ex J| A : v 
(r, Xi^ei) n i=x :e \ (A, Xi^e^ n i=x :v (x t fresh, 1 < i < n) 



T : let{xi=ei}" =1 ine^ A : v 

r : e x ^ : v x : e 2 ^ A : v 2 

r : ei seq e 2 JJ- A : v 2 



(app) 

C n) 

(let) 



(seq) 



In (var x ), recall that by Definition 3.3 we assume x^dom(T). 

In (let) 'xi fresh' means that Xi G - fv(v), and Xi G - dom{Y) and Xi G - fv(Y{x)) 
for any x G dom(Y), and similarly for A. x 

Remark 3.5. Recall that we equate terms up to a-equivalence of let-bound 
variables. It is convenient to impose a well-formedness condition on derivations, 
that the new variable names above the line in (let) are introduced distinct (so 
they do not 'clash' with other variable names elsewhere in the derivation). This 
standard 'trick' is often called Barendregt's naming convention [12]. 

A particularly interesting effect that this has in our operational semantics is 
how we define free variables for let expressions (Definition 3.2). For any expression 

e = let {xi=e i \ r l =1 in e' 



1 Consistent with [■)■')] and subsequent work, we allow the possibility that Xj 6 f v ( e j) or 
Xi efvie'j) for 1 < i,j < n. 
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to be evaluated, all occurrences of x,'s in e address the local let-bound a^'s. 
Especially, so are the free occurrences of a^'s which are therefore needed to be 
distinguished from free variables of e. 

We take a moment for a few observations: 

• Xx.e is the usual functional abstraction, ex is the usual function applica- 
tion. This syntax is restricted (it has to be e applied to a variable symbol) 
but this just simplifies our mathematics; expressivity is not lost because we 
also have let. This is standard; we inherit it from [55]. 

• We identify syntax up to a-conversion. [55] does not. The z in his Variable 
reduction rule [55, Figure 1] makes (var x ) above look different from Variable. 
But z is just his notation for 'an a-equivalent term to z\ Thus, Launchbury 
isolates all the a-conversion needed in his reductions in that single rule. 2 
Our design choice is less computational, but we prefer the more abstract 
representation of syntax. 

• We 'garbage-collect' the bindings Xi i— > e\ in the final heap (the A) in (let), 
unlike [55, 105]. We will discuss this point in some detail: 

This design choice brings us benefits later. Thanks to 'garbage-collection', 
let-bindings are not propagated 'outside their scope' to the final heap and 
so we do not have to reason explicitly 'up to' these choices. (For example: 
without a garbage-collecting (let) rule we would have to 'hard-wire' the 
same effect into Theorem 8.4 and Corollary 8.6.) 

The rule for let in [55, 105] does not garbage collect. For them, heaps are 
left with extra bindings after a let-binding is evaluated. 

The semantics in [55, 105] allows variables to escape their scope during 
evaluation, which is forbidden in our semantics. For example, let x = 
Xy.y in Xz.(xz) will evolve in [ ] to place a binding x i— > Xy.y in the heap; 
this could then be 'accidentally' bound to an x occurring elsewhere, outside 
the scope of the let. Launchbury is well aware of this issue and comments 
on it [35, Section 3.1]. To avoid 'accidental name-clash' in evaluations in 
his system, Launchbury imposes a normalisation procedure which chooses 
all variable names distinct before evaluation begins. 

This is fine, if we just want to evaluate a particular expression, normalised, 
in a particular heap. However, for reasoning about the evaluation of classes 
of programs and proving operational equivalences between them, a garbage- 
collecting rule for let is better. 

On the other hand, our specific kind of garbage-collection causes restrictions 
to the expressiveness of our system. For example, the arguably reasonable 



2 Note that in [55], z ranges over values. Also, note that the normalisation mentioned in 
[55, Section 3.1] describes a translation from full A-calculus syntax to a target syntax which is 
the restricted, mathematically convenient, language which Launchbury's operational semantics 
uses; implicit in that procedure is an a-conversion step, but that is in the full A-calculus syntax 
and this should not be confused with a-conversion, or a-convertability, in the target syntax. 
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program e = let y = Xz.z in Xx.y does not reduce in our system. Suppose 
on the contrary that II is the derivation r : e JJ. A : v for some heaps T, A, 
and value v. Then, II takes the form 

(lam) 

ru{j/M Xz.z} : Xx.y JJ. T U {y i— > Az.z} : Ax.y 

(let)*. 

T : e Jl A : Xx.y 

However, this is not correct because the above (let)* is not a valid instance 
of (let). The reason is that y G fu(Xx.y) which is not allowed by our 
operational semantics. See Section 11.3 for more on this. 

• The intuition of e-y seq e 2 is 'evaluate e x ; throw the result away ... but 
keep the heap and use it to evaluate e% . This is very similar to the Clean 
(StrictLet) rule [106, 84]. This operational behaviour of seq is the tech- 
nical focus of this thesis and further discussion of its behaviour will follow. 
See Remark 5.2 for example for a more precise explanation of the above in- 
tuition. For a discussion on a subtly different selective strictness enforcing 
mechanisms of Haskell see Section 2.2. 

Definition 3.6. A derivation is the labelled tree — labelled with terms, heaps, 
and derivation rules from Definition 3.4 — that justifies a reduction T : e JJ- A : v. 
U, 111, n' ; H x , . . . will range over derivations. 

We may write T : e JJ- A : v ' as shorthand for T : e JJ. A : v is derivable". 

We may write T : e JJ-n A : v ' as shorthand for TI is a derivation of T : e JJ. 
A :v". 



3.2 Fundamental Properties 

In this section, we provide a few results which become handy later, and present 
fundamental properties of our system. In particular, Lemma 3.7 serves proof 
of determinism (Theorem 3.9) which is of special importance. It is noteworthy 
that Theorem 3.9 is correct for Launchbury's system [55], as well as that of van 
Eekelen and de Mol [105, 106, 108]. 

Lemma 3.7. IfT : e JJ. A : v then dom(T) = dom(A). 

Proof. Suppose T : e JJ. n A : v. We proceed by induction on II, based on its final 
rule: 

• (lam). II takes the form 

(lam) 

T : Xx.e JJ- T : Xx.e 

in which case the result is trivial. 
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(var x ). II takes the form 



T' : e $ A' : v 

(var x ) 



(r', x i— > e) : x § (A', x i— > i>) : v 



where T = (T',x i— > e) and A = (A',x i— > u). By inductive hypothesis, 
dom(T') = dom(A'). The result follows by noting that dom(T) = dom(T')U 
{x} and dom(A) = dom(A') U {x}. 

(app). II takes the form 

r :e^0:Ay.e' : e'[x/y] $ A : v 



T : e Z JJ. A : v 



(app). 



By inductive hypothesis, dom(T) = dom(Q). Similarly, by inductive hy- 
pothesis, dom(O) = dom(A). The result follows by transitivity. 

(let). II takes the form 

(r, Xi ^ e^ =1 :e^(A,Xi^ e[)™ =l : v 



T : \et{x i =e i } 7 l =1 \ne J| A : v 



(let). 



By inductive hypothesis, dom((T,Xi i— > e,)" =1 ) = dom((A,Xi \— > eQ^). 
The result follows by noting that dom((r,x i i— > ei)" =1 ) = dom(r) U {^i}7=i 
and dom((A,Xi i— > e i )^ =1 ) = dom(A) U {a?i}™ =1 . 

(seq). II takes the form 

r : ei i- G : v x Q : e 2 ij, A : v 2 



r : ei seq e 2 JJ- A : w 2 



(seq). 



By inductive hypothesis, dom(r) = dom{Q). Similarly, by inductive hy- 
pothesis, dom{Q) = dom(A). The result follows by transitivity. 

□ 

Notation 3.8. Write U =\ eta n' lu/ien II and H' are equal up to renaming let- 
bound variables. 

The following theorem shows that derivations are unique up to an easy re- 
naming, and reduction is deterministic: 

Theorem 3.9. If T : e lj, n A : v and T : e lj, u , A' : v' then II =| eta II', A = A', 

and v = v' '. 
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Proof. Induction on II based on its final rule: 

• (lam). Trivial. 

• (var x ). Let V = (T',x i— > e), Ai = (A[,x i— > v\), and A2 = (A' 2 ,x 1— > V2). 
Suppose that L : x JJ. ni Ai : V\ and Y : x JJ-n 2 A 2 : v 2- Accordingly, 11^ and 
n' 2 exist such that r" : e JJ. n / A[ : V\ and V : e JJ. n / A 2 : i>2- By inductive 
hypothesis then «i = V2, A[ = A 2 , and H^ =i e ta II 2 . The result follows. 

• (app). Suppose that 

— T : e J^m © : Ay.e and : e[x/y] J|n 2 A : i> and 

- r : e JJ-ni ©' : Ay.e' and 0' : e'[x/y] JJ. n j A' : v' 

(we a-convert the bound variable y so it is the same in both derivations). By 
inductive hypothesis = 0' and Xy.e = Xy.e'. Thus e[x/y] = e'[x/y\. We 
use the inductive hypothesis. On the other hand, by inductive hypothesis, 
IIi =ieta n^. Likewise, II 2 =\ eta II 2 . The result follows. 

• (let). Suppose that 

(r, Xi ^ e;)f =1 : e JJ. Hl (A, x { ^ e'-J^ : v, and 
(r, x % -> e % )U : e J| n ; (A', x t .-» e> 2i )? =l : i/. 

By inductive hypothesis v = v' and (A,Xj 1— > e' li )" =1 = (A', Xj 1— > e' 2 j)™ =1 . 
It follows easily that A = A'. By the other part of inductive hypothesis, 
IIi =iet Q IIi, which implies II =, eto II'. 

• (seq). Suppose that 

— T : e-y JJ-n 1 : V\ and : e 2 Jj-n 2 A : v 2 and 

- T : ei Jin; ©' : «i and 0' : e 2 Jk' 2 A' : w 2 

By inductive hypothesis, = 0' and IIi =| e t a II'i- Thus, by inductive 
hypothesis, A = A', U 2 =\ e ta TE^, and v 2 = v' 2 . The result is immediate. 

□ 

Lemma 3.10. Suppose that T(x) is a value v x and T : e J,Ln A : v. Then, for 
every heap H in H such that x G dom(E), S(x) = v x . In particular, A(x) = v x . 

Proof. Let L : e JJ-n A : v where T(x) is a value v x . We proceed by induction on 
II based on its hnal rule: 

• (var y ) for y other than x. H takes the form 



■ w 

r' : e y JJ, A' : v y 



(V, y^e y ):y$ (A', y ^ v y ) : v t 



(var 3 
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where (T',y \—*e t/ ) = r and (A', y i— » v y ) = A. Given that T'(x) = T{x) = 
v x , by inductive hypothesis, for every heap S in II' such that x G dom(E), 
E(x) = v x , and in particular A'(x) = v x . The result follows by noting that 
A(x) = A'(x). 

There is nothing to prove for the cases (lam) (because T = A) and (var x ) 
(because x (fc dom(E) for every H in II except T and A). The case (let) is similar 
to (var y ). The cases (app) and (seq) follow immediately by transitivity. □ 

Lemma 3.11. Suppose that T : e JJ A : v and T(x) ^ A(x) for some x G dom(T), 
Then, A(x) is a value. 

Proof. Induction on II based on its hnal rule: 

• (var x ). IT takes the form 



: n ' 

V :e x ^A' : v x 



(r', x h^ e x ) : x Jj- (A', x i-> v x ) : v a 
By dehnition, v x = A(x) is a value. 
(var y ) for y other than x. H takes the form 

: n/ 

r' : e y l A' : v y 



(var, 



(r', y !-»■ e y ) : y Jj. (A', i/h^):!) 



(var y ) 



.y 



where (r',y i— > e y ) = T and (A',y i— > w y ) = A. If r(x) 7^ A(x), then 
T'(x) = A'(x) because T(x) = T'(x) and A(x) = A'(x). By inductive 
hypothesis, we conclude that T'(x) is a value. The result follows by noting 
that A'(x) = A(x). 

(app). II takes the form 



; III ; n 2 

r :eJ|0 : Xz.e' Q : e'[y/z] ij, A : v 

T : e y JJ. A : u 

We now consider cases: 



(app). 



- T(x) = e(x). By r(rc) ^ A (re), it follows that Q(x) ^ A (re). By 
inductive hypothesis, A(x) is a value, as expected. 

— T(x) ^ Q(x). By inductive hypothesis, then, 0(x) is a value like i^. 
By Lemma 3.10, A(x) = v x . The result follows. 

There is nothing to prove in the (lam) case (because T = A). The (let) and 
(seq) cases are similar to (var y ) and (app), respectively. □ 
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Chapter 

Notions of Equivalence 



Three plausible notions of equivalence between expressions offer themselves: value 
equivalence ~„, heap equivalence ~^, and strict equivalence ~ s (defined below). 
While it is easy to dehne notions of equivalence, what makes them interesting is 
the theorems we prove about them. We work with ~ s , as the strongest equivalence 
offering the most scope for proving strong properties. Later, we prove the non- 
trivial fact that — in a Launchbury-based system like ours that the (let) rule 
garbage-collects — ~ v coincides with ~ s (Section 8.2). 

4.1 Definitions 

In this section, we define three notions of equivalence between expressions. It is 
noteworthy that, although we use them for our own operational semantics here, 
they are perfectly definable for any Launchbury-based operational semantics. 

Definition 4.1. Define e\ ~„ e 2 by: 

Vr, v. (3Ai. r : e x \ A x : v) O (3A 2 . V : e 2 ^ A 2 : v) 
We call ei and e 2 value equivalent. 

Intuitively e\ ~„ e 2 when e\ and e 2 compute the same final value, given the 
same initial heap. For example: let {x = Xx.x, y = Xx.x} in x y !~„ Xx.x. 

Definition 4.2. Define e\ ~^ e 2 by: 

Vr, A. (3v x . r : ei ^ A : v x ) & (3v 2 . T : e 2 ij, A : v 2 ) 
We call e\ and e 2 heap equivalent. 

Intuitively, e x ~^ e 2 when e x and e 2 compute the same final heaps, given 
the same initial heap; we do not examine the final values, which may differ, 
or the order of evaluation. For example, x seq y ~^ y seq x (this follows by 
Corollary 9.2) and it is easy to show that x seq y ^ v y seq x. 
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Definition 4.3. Define e\ ~ s e 2 by: 

Vr, v, A. (r : ei J| A : v) O (r : e 2 J| A : u) 
LFe ca// ei and e 2 strictly equivalent. 

Intuitively, e x ~ s e 2 when, given the same initial heap, e\ and e 2 compute the 
same final value and final heap. For example (Xx.x)y ~ s y, and x seq y ^ s y. 
Less obviously, x\ seq (x 2 seq £3) ~ s x 2 seq (xi seq X3) (from Theorems 5.1 
and 9.3). 

4.2 Observations about Notions of Equivalence 

Lemma 4.4. ~„ ; ~^, and ~ s are equivalence relations (reflexive, symmetric, 
transitive). 

Proof. We prove the reult for ~ s . The proof is similar for ~^ and ~„. 

• Reflexivity. For any expression e, the following statement is a tautology 
which means that e ~ s e: 

Vr, «, A. (r : e J| A : v) O (r : e J| A : w). 

• Symmetry. Fix expressions e\ and e 2 such that ei ~ s e 2 . By Definition 4.3, 
this means that: 

Vr, v, A. (r : ei J| A : w) O (r : e 2 J| A : v). 

Hence, 

Vr, v, A. (r : e 2 J| A : v) O (r : e x J| A : u). 

Namely, e 2 ~ s ei. 

• Transitivity. Fix expressions e\, e 2 , and e^ such that ei ~ s e 2 and e 2 ~ s e^. 
Furthermore, fix arbitrary To, Ao, and vq for which Fo : e\ JJ- Ao : t>o- By 
Definition 4.3, from e\ ~ s e 2 , we conclude To : e 2 J| Ao : i>o- Therefore, 
again by Definition 4.3, from e 2 ~ s e 3 , we conclude r : e 3 JJ. A : v , as 
required. 

D 

Lemma 4.5. ei ~ s e 2 «;/ and only if e\ ~h e 2 and ei ~„ e 2 . 

Proof. The left-to-right direction is obvious. Suppose that e\ ~ v e 2 and ex ~k e 2 . 
Fix arbitrary r, Ax, and V\ such that 

r:eiJ|Ai:«i. (4.1) 

Because ei ~„ e 2 , Definition 4.1 and (4.1) imply that there exists a heap A 2 such 
that 

T : e 2 J| A 2 : v x . (4.2) 
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On the other hand, because e\ ~h e 2 , Definition 4.1 and (4.1) imply that there 
exists a value i> 2 such that 

r : e 2 JJ. Ai : « 2 . (4.3) 

Finally, because our operational semantics is deterministic (Theorem 3.9), (4.2) 
and (4.3) imply that v\ = i> 2 and Ai = A 2 . □ 

Theorem 4.6 formalises an intuition that seq evaluates its first argument, then 
'throws it away and keeps the heap': 

Theorem 4.6. e x ~/j e[ and e 2 ~ s e' 2 imply e\ seq e 2 ~ s e^ seq e 2 . 

Proof. Suppose T : e-y seq e 2 J| n A : v. II takes the form 



r : ei J| : «i 9 : e 2 J| A : w 

(seq). 

T : ei seq e 2 JJ. A : v 

By assumption ei ~h e[, so T : e^ JJ. : v[ for some v[. By assumption e 2 ~ s e' 2 , 
so : e 2 JJ. A : v. We construct a derivation as follows 



r : ei 4J. 6 : v [ 6 : e 2 4J. A : u 

(seq). 

T : e^ seq e 2 JJ- A : v 

The result follows by symmetry. □ 

A converse to Theorem 4.6 is false; see Corollary 5.6. 

~ s is a congruence with respect to seq and to application: 

Theorem 4.7. 1. e x ~ s e[ and e 2 ~ s e 2 imply e\ seq e[ ~ s e 2 seq e 2 . 

#. e ~ s e' implies e x ~ s e'x. 

5. e ~ s e' implies 

let {^ = e*}? =1 in e ~ s let {x; = e;}™ =1 in e'. 
Proof. 1. From Lemma 4.5 and Theorem 4.6. 

2. Suppose r : e x JJ. n A : v. H takes the form 



: iii : n 2 

r : e JJ, : Xz.e" : e"[x/z] 4J. A : v 
r : e a; JJ. A : v 



(app). 



By assumption T : e' JJ. : Xz.e". We combine this with II 2 to observe that 
r : e' x JJ. A : w as required. 
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3. Suppose e ~ s e'. Suppose T : let {xi = ei}™ =1 in e |n A : «. II takes the 
form 

: n i 



T : \e\.{x i =e i } 7 } =l \ne JJ. A : u 
Because e ~ s e', there exists a 11^ such that: 

(T,x^e^ =1 : e' i^ (A, Xi ^e^ =1 : v 



(let). 



(let). 



T : \et{xi=ei}™ =1 \ne' 1}, A : w 

The result follows. D 

Remark 4.8. We hypothesise that also, e ~ s e' and e, ~ s e^ for 1 < i < n 
implies 

let {xi = e;}™ =1 in e ~ s let {x { = e^}™ =1 in e'. 

The proof would probably use Theorem 8.11. Considering this is future work. 

Note that ~ s is not a congruence for A. For example, x seq x ~ s x (Theo- 
rem 5.5) and Ax.(x seq x) ^ s Ax.x. This is because ~ s is intensional on values; 
v ~ s v 1 when v and t/ are equal up to a-equivalence. 

In fact, this has a semantic analogue in the use of valuations in denotationally- 
based techniques like that of [ ■ ]; valuations make terms 'closed', by assigning 
meanings to all their free variables. It is possible to consider extensional no- 
tions of operational equivalence (see [ ] for an overview), and this is for future 
work. However, our intensional equivalences are already interesting, and indeed, 
they probably provide a less cluttered canvas with which to illustrate the proof- 
methods which we develop for seq. 
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Chapter 

Elementary Properties of seq 



This chapter presents some elementary properties of seq. Although they are 
elementary, perhaps they can still be quite surprising. 

seq is called 'sequential composition', so we start by proving a typical sequen- 
tial property, that seq is associative (Theorem 5.1). In Remark 5.2 we discuss a 
subtlety of the interpretation of seq, that the intuition of ex seq e 2 as 'ex then 
e 2 ' can mislead: e\ can trigger some (or even all) of the evaluations in e 2 , so 
that e 2 is still evaluated 'first'. Finally, we prove that seq is idempotent (Theo- 
rem 5.5). The result we give in this chapter is restricted to variable symbols; the 
proof for general expressions is rather harder, and comes later on in Chapter 10. 
Theorem 5.5 is still useful, to establish a counterexample in Corollary 5.6. 

Theorem 5.1 (Associativity). 

e 1 seq (e 2 seq e 3 ) ~ s (ex seq e 2 ) seq e 3 

Proof. Suppose Y : e-y seq (e 2 seq e 3 ) JJ. n A : v 3 . U takes the form 

\ n 2 j n 3 

: tt : e 2 JJ. 5 : v 2 3 : e 3 JJ. A : v 3 

■ l (seq) 

T : e 1 JJ. : v t : e 2 seq e 3 JJ. A : v 3 

(seq). 

T : ei seq (e 2 seq e 3 ) JJ. A : v 3 

We rearrange this as follows: 

i n a j n 2 

r : ei JJ- : v 1 : e 2 JJ, 5 : v 2 ; n 

(seq) _ • A 

r : ei seq e 2 JJ. 5 : w 2 5 : e 3 JJ. A : v 3 

(seq). 

T : (ei seq e 2 ) seq e 3 JJ. A : w 3 

So T : (ei seq e 2 ) seq e 3 JJ. n A : v 3 . The reverse implication is similar. □ 

Remark 5.2. seq is so called because it is 'sequential'. So, one may think of 
e\ seq e 2 as 'calculate ex, then e 2 '. But, there are some subtleties. For example, 
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it is a fact that 

(x i— > (Xz.z)y, y \— ► let 2: = Az.z in 22) : x seq ?/ JJ-n 

(x 1— > \z.z,y \—> Xz.z) : Az.z. 

The reader can verify that inside II, the evaluation of let z = Xz.z in zz, bound 
to y, is carried out in some sense 'before' that of (Xz.z)y, bound to x. 

ei seq e 2 means what (seq) in Definition 3.4 says that it means: "evaluate e\ 
first, then evaluate e 2 in the resulting heap". It could be that, after evaluating e l5 
every evaluation that would have been triggered by evaluating e 2 in the original 
heap (that is, by evaluating e 2 first) has already been carried out. We return to 
this in the discussion opening Chapter 7. 

It is easy to prove (a restricted form of) idempotence of seq up to ~ s . Lem- 
mas 5.3 and 5.4 are convenient: 

Lemma 5.3. IfT : x JJ. A : v is derivable then A(x) = v. 

Proof. Suppose V : x i^u A : v. U must take the following form where (T',x 1— > 
e) = T and (A', x i-> v) = A 

: n ' 

r' : e ^ A' : v 

(var x ). 

(r', x 1— > e) : x JJ, (A', x \-^> v) : v 

Obviously, A(x) = (A',x 1— > v)(x) = v. D 

Lemma 5.4. IfT(x) = v then T : x JJ. T : v. 

Proof. Let V be such that L = (L', x 1— >■ u). Then, 

(lam) 

r' : u 4J. r' : u 

(var x ) 

T :x JJ,T : v 

is a valid derivation, and the result follows. □ 

Theorem 5.5. x seq x ~ s x. 

Proof. Suppose T : x seq x JJ. n A : v. H takes the form 



: iii \ n 2 

r : x JJ, 6 : v' e:xJ|A:v 
T : x seq x JJ. A : v 



(seq). 



By Lemma 5.3 0(x) = v'. By Lemma 5.4 : x JJ. : v'. By Theorem 3.9 it 
follows that A = and v' = v. So Hi is a derivation of L : x JJ. A : w. D 
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It is also true that e seq e ~ s e, but the proof is harder. We need to know 
that r : e JJ. A : v implies A : e JJ. A : v, that is, re-evaluation of e does not change 
the heap further (and returns the same value). See Theorems 10.1 and 10.2. 

Theorem 5.5 is enough to prove a converse to Theorem 4.6 false: 

Corollary 5.6. Suppose that e[ seq e\ ~ s e' 2 seq e^. Then e\ ~ s e^ does not 
imply e[ ~ h e' 2 . 

Proof. It suffices to provide a counterexample. By associativity (Theorem 5.1) 
(x seq z) seq z ~ s x seq (z seq z). By Theorem 5.5 and part 1 of Theorem 4.7 
x seq (z seq z) ~ s x seq z. By Lemma 4.4, (x seq z) seq z ~ s x seq z. Also, by 
Lemma 4.4 z ~ s z. It is not hard to verify that x seq z ^ x. □ 



Chapter 
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Further Properties of Launchbury's 
Semantics 



The operational semantics from Definition 3.4 is inherited from [55, 106], with 
slight modifications. In Chapter 7, we apply some new proof-techniques to that 
semantics, as discussed in the Introduction. Before we do so, however, it will be 
useful to make some fundamental observations about the operational semantics. 
As far as we know, the only place in which these have previously appeared in the 
literature is in [71]. This thesis borrows largely from [ ], where for this particular 
chapter, all the proofs of the former work are completed. 

We distinguish between uses of a variable that cause evaluation and change 
the heap and those that do not and 'look up ' a value, without changing the heap. 
This is the distinction between non-trivial and trivial instances of (var x ) in Def- 
inition 6.1. Thus, in Lemma 6.7, we establish that TI contains a non-trivial 
instance of (var x )' and TI changes the expression bound to x\ are equivalent. 
We write diff(U) for the set of these variables (Definition 6.4). This notion is 
important: in Chapters 8 and 9 we will obtain our central results by induction, 
not on derivations II, but by induction on the sets diff(Jl). 

The operational semantics is intended to capture call by need evaluation. 
Thus, we expect that evaluation is 'shared' and that expressions should be eval- 
uated at most once. This intuition is captured by Theorem 6.9. Finally, the only 
variables which 'matter' to 14 are the variables x such that (var x ) appears in 14; 
other x do not matter. Lemmas 6.15, 6.22, and Theorem 6.23 make that formal. 

This chapter is divided into two sections: Section 6.1 explores the differences 
between instances of (var x ). Section 6.2 classifies the essential parts of a heap 
for a derivation. Although many of the proofs in this chapter are long, they are 
mostly routine inductions. Exceptions are Theorem 6.9, and Theorem 6.23. 

6.1 Trivial/Non-trivial Instances 

In this section, we divide (var x ) instances into two groups: Trivial and Non- 
Trivial (Definition 6.1). We then formalise our intuition that trivial instances do 
not change the heap (Lemma 6.5) whilst non-trivial ones do (Lemma 6.7). These 
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results are then used to prove that each derivation contains at most one (var x ) 
instance (Theorem 6.9). 

Definition 6.1. Suppose T : e JJ-n A : v. Write V(U) for the set of x E dom(T) 
such that II contains an instance of (var x ). 

Suppose II contains an instance o/(var x ) ; as illustrated: 



r' : e x $ r' : v 2 



(r', x ^ e x ) : x JJ. (r', x i-> v x ) : v 3 



(var, 



CaZZ i/ie instance trivial when II' consists of a single instance of (lam) (equiva- 
lently, when e x = v x ). Call the instance non-trivial otherwise. 

Lemma 6.2. Suppose IL = leta il 2 . T/ien, F(IIi) = V(n 2 ). 

Proof. Immediate from Notation 3.8 and Definition 6.1. □ 

Lemma 6.3. Let x G dom(T) and V : e J| n A : v. Then, H contains no non- 
trivial instance of (var x ) if and only ifT(x) = A(x). 

Proof. Let T : e JJ-n A : w such that II contains no non-trivial instance of (var x ). 
We proceed by induction on II based on its final rule: 

• (lam). II takes the form 

(lam) 

T : Xx.e JJ. T : Xx.e 

in which case the result is trivially correct because T = A and II contains 
no non-trivial instance of (var x ). 

• (VaiO. II takes the form 



r' : e x $ A' : v x 



(r 7 , x h^ e x ) : x J| (A', x ^ v x ) : v a 



(var x ) 



where (T',x \—> e x ) = T and (A',x \—> v x ) = A. Given that x (£ dom(T'), 
this is the only (var x ) instance of II. And, in order for this very instance to 
be trivial (Definition 6.1), we must have e x = v x , as desired. On the other 
hand, if e x = v x , then (by Definition 6.1) this is not a non-trivial instance 
of (var x ). 

(var y ) for y other than x. H takes the form 



■n' 

V : e y ty A' : v y 



(r', y !-»■ e y ) : y Jj. (A', y ^ v y ) : «, 



(var y ) 
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where (T', y \—> e y ) = T and (A', y \—> v y ) = A. By assumption, II contains 
no non-trivial instances of (var x ). Therefore, II' contains no non-trivial 
instances of (var x ) either. By inductive hypothesis then T'(x) = A' (re). 
The result follows by noting that T(x) = T'(x) and A (re) = A' (re). 

On the other hand, if T(x) = A (re), then r'(re) = A'(x). By inductive 
hypothesis, II' contains no non-trivial instance of (var x ). Thus, II contains 
no non-trivial instance of (var x ) either. 

(app). By inductive hypothesis. 

(let). II takes the form 

■ w 

(r,x^e l )r =1 :e^(A,re l ^e^ =1 :^ /i . 

(let). 

T : let{xi=ei}" =1 ine r[L A : v 

Given that there are no non-trivial instances of (var x ) in II, no non-trivial 
instances of (var x ) exist in II' either. By inductive hypothesis then (T, Xj i— > 
e i)l=i( x ) = (A,X; l_ ^ e i)7=i( x )- The result follows by noting that (T, Xi i— > 
e i)r=i( 2 -) = ^(x) and (A,rci i— > eQ(x) = A(x). 

On the other hand, if T(x) = A(x), then (r,Xj i— > e i ) r l =l (x) = (A,Xj i— > 
e i)r=i( x )- By inductive hypothesis, II' contains no non-trivial instance of 
(var x ). Thus, II contains no non-trivial instance of (var x ) either. 



□ 



• (seq). By inductive hypothesis. 

Definition 6.4. Suppose T : e 4J-n A : v. Define diffili) by: 
diff(U) = {x e dom(T) | T(x) ^ A(x)} 
Lemma 6.5. Suppose T : e JJ-n A : v. Then the following are equivalent: 

(1) x G V(U) andx £ diff (II). 

(2) x G ^(11) and T(x) is a value (and since A(x) = T(x), so is A(x)). 

Proof. By induction on II based on its hnal rule. For each case, we prove that 
(1)^(2): 

• (lam). II takes the form 

(lam). 

T : Ax.e JJ- T : Ax.e 

In this case, both directions are correct because V(H) = 0. 
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(var x ). II takes the form 

: n 

r : x JJ. A : v x . 

If x ^ diff(TL), by Definition 6.4, r(x) = A(x) = v x , i.e., r(x) is a value. 
(Note that, by construction of our (var x ) rule, we know that A = (A', x \— > 
u^) for some heap A'.) Suppose that Y{x) is a value like v. On one hand, by 
Lemma 3.10, A(x) = v too. On the other hand, by Lemma 5.3, A(x) = v x . 
Therefore, by Determinism (Theorem 3.9), v = v x , and x <£ diff(H). 

(var y ) for y other than x. H takes the form 









r' 


: n' 

: e y $ A' : 


^ 




(r', 

e y) 


y 


i — > 
r 


e y ) : 
and 


y^(A',y 

(A',y h-> 


i — > 


u„) : w y 

= A. 



( var y) 



where (T',y ^ e y ) = Y and (A',y ^ u„) = A. If x G 7(n) \ dif (n), 
then x G V"(II') \ diff(H') too. By inductive hypothesis, Y'(x) is a value. 
Therefore, Y(x) is a value because Y(x) = Y'(x). On the other hand, if 
Y(x) is a value, r'(x) is also a value because L(x) = Y'(x). By inductive 
hypothesis, x G V(II') \ diff(Yl'), i.e., L'(x) = A'(x). Thus, L(x) = A(x) 
(because A(x) = A'(x)) and the result follows from Definition 6.4. 

(app). II takes the form 



: IT ■ n 2 

Y :eij,Q : Xz.e' : e'[y/z] ty A : v 



(app). 



T : e y J| A : v 
Suppose that x G V(H) \ diff(Yl). There are two possibilities: 

— x G F(rii). We claim that x £ diff(IY) implies x ^ diff (JX\) and the 
result follows by inductive hypothesis. This claim can also be proved 
easily. Suppose that x G diff (Hi). By Lemma 3.11, then, 0(x) is a 
value v x which, by Lemma 3.10, implies that A(x) = v x too. This 
implies x G diff(IY) which is contradictory. 

— x ^ F(IIi). Therefore, x G F(n 2 ), and, by Lemma 6.3, r(x) = B(x). 
Hence, x (£ diff (H2) because otherwise r(x) = 0(x) 7^ A(x), namely, 
x G diff(U) which is contradictory. Thus, by inductive hypothesis, 
0(x) is a value. The result follows because r(x) = 0(x). 

Suppose on the other hand that x G V(n) and r(x) is a value v. By 
Lemma 3.10, A(x) = v. Accordingly, T(x) = A(x), i.e., x ^ diff(U). 
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• (let). II takes the form 

: n ' 

(r, Xi .-> ei )? =1 : e $ (A, x t .-> e'^ =1 : v 

(let). 

T : \et{xi=ei}i =1 \ne JJ. A : v 

Suppose that x G 7(11) \ diff(IL). This implies that x G 7 (IT) \ diff (TV). 
By inductive hypothesis, then, (r, Xi i— > ej)™ =1 is a value. The result follows 
by noting that r(rc) = (T,Xi \—> ei)f =l (x). 

Suppose on the other hand that x G 7(11) and V(x) is a value. There- 
fore, x G 7(17) and (L,£j i— ► ei)" =1 (x) is a value (because (T,Xi i— > 
e i )" =1 (x) = r(x)). By inductive hypothesis, cc G 7 (IT) \ diff(U'), i.e., 
(A, a;, i— > e^^Li^) = (r,Xi i— > ei)™ =l (x). Consequently, T(x) = A(x) and 
the result follows. 

• (seq). II takes the form 



: iii ; n 2 

r : ei i : v x : e 2 jj. A : v 2 



(seq). 



r : e 1 seq e 2 JJ- A : v 2 
Suppose that x G 7(11) \ diff(U). There are two possibilities: 

— x G 7 (Iii). We claim that x £ diff (If) implies x ^ diff (Hi) and the 
result follows by inductive hypothesis. This claim can also be proved 
easily. Suppose that x G diff(Ui). By Lemma 3.11, then, Q(x) is a 
value v x which, by Lemma 3.10, implies that A(x) = v x too. This 
implies x G diff(U) which is contradictory. 

— x (£ 7 (Iii). Therefore, x G 7(112), and, by Lemma 6.3, T(x) = Q(x). 
Hence, x (£ diff(U 2 ) because otherwise Y(x) = Q(x) ^ A(x), namely, 
x G diff (II) which is contradictory. Thus, by inductive hypothesis, 
Q(x) is a value. The result follows because T(x) = Q(x). 

Suppose on the other hand that x G 7(11) and T(x) is a value v. By 
Lemma 3.10, A(x) = v. Accordingly, T(x) = A(x), i.e., x £ diff (II). 

□ 

Corollary 6.6. Suppose L : e\ 4J-ni Ai : v\ and Y : e 2 J|n 2 A 2 : v 2 such that 
7(n x ) = 7(LT 2 ). Then, diff (Hi) = diff(TL 2 ). 

Proof. Suppose L : e\ J^m Ai : Vi and L : e 2 JJ. n2 A 2 : v 2 such that 7(11!) = 
7(II 2 ). Let x G diff (Hi). If we can show that x G diff(U 2 ), the result follows by 
symmetry. Suppose on the contrary that x ^ diff(H 2 ). Then, because x G 7(II 2 ), 
by Lemma 6.5, T(x) is a value. On the other hand, because x G 7(IIi), by 
Lemma 6.5, this means that x £ diff (Hi) — which is contradictory. □ 
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Lemma 6.7. Suppose L : e JJ-n A : v. Then, for any x G dom(T), the following 
are equivalent: 

(1) x G diff(IL). 

(2) T(x) is not a value and A(x) is a value. 

(3) II contains a non-trivial instance o/(var x ). 

Proof. Suppose r : e JJ-n A : v. We will prove that, for any x G dom(T), 
(1) =* (2) => (3) => (1). 

(1) =4> (2). Suppose x G diff(U). By construction (Definition 6.4) this means 

that T(x) ^ A(x). Therefore, by Lemma 3.11, A(x) is a value. On the 
other hand, T(x) cannot be a value, because otherwise, by Lemma 3.10, 
T(x) = A(x) which is contradictory. 

(2) =5- (3). Contrapositive of (the =5- direction in) Lemma 6.3. 

(3) =5- (1). We prove the contrapositive. If x ^ diff(IL), by construction (Def- 

inition 6.4), T(x) = A(x). By Lemma 6.3, therefore, II contains no non- 
trivial instance of (var x ), as desired. 

□ 

Corollary 6.8. Suppose T : e J| n A : v where x G V(n). Then, A(x) is a value. 

Proof. Suppose L : e JJ-n A : v where x G V(n). There are two possibilities: 

• If x G diff(JI), then, by Lemma 6.7, A(x) is a value. 

• If x $_ diff(H), then, by Lemma 6.5, A(x) is a value. 

□ 
We can now move on to some (slightly) less obvious results: 

Theorem 6.9. Suppose T : e JJ-n A : v. For each x G dom(T), II contains at 
most one non-trivial instance o/(var x ). 

Proof. Induction on II based on the final rule in it: 

• (lam). II takes the form 

(lam). 

L : Xx.e J| L : Xx.e 

The result is correct because II contains no (var x ) instances. 
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(var x ). II takes the form 



■ w 

V' : e x JJ, A' : v a 



(r', x ^ e x ) : x JJ. (A', x h^ v x ) : v a 



(var x ). 



By construction x ^ dom(T'). Using our well-formedness assumption that 
let-bound variables are introduced distinct (Remark 3.5), it is easy to verify 
that x g V(W). The result follows. 

(var y ) for y other than x. H takes the form 



V : e y ^ A' : v y 



(V, y^e y ):y^ (A', y i-> Vy ) : w. 



(var y ) 



where (F',y i— > e y ) = F and (A',y i— > w y ) = A. By inductive hypothesis, 
II' contains at most once instance of (var x ). The result follows because II 
only adds a (var y ) to II'. 

(app). II takes the form 



; II! ; n 2 

r :eJ|0 : Xz.e' : e'[y/z] ty A : v 

T : e y JJ. A : v 
We now consider cases: 



(app) 



— The case x $ diff(Hi). By Lemma 6.7, IT contains no non-trivial 
instance of (var x ). By inductive hypothesis, II 2 contains at most one 
non-trivial instance of (var x ). The result follows. 

— The case x G diff (n x ). By inductive hypothesis, H 1 contains at most 
one non-trivial instance of (var x ). By Lemma 6.7, Q(x) is a value. 
Furthermore, by Lemma 3.10, A(x) is a value too. By Lemma 6.7, 
again II2 contains no non-trivial instance of (var x ). The result follows. 

(let). II takes the form 

■ w 

(r, Xi » e,)™ =1 : e 4J. (A, Xi .-> e^ =1 : v 

(let). 

T : \et{xi=ei}™ =1 \ne JJ. A : v 

By inductive hypothesis, II' consists at most on instance of (var x ). The 
same is visibly the case for II too because it only augments II' by a (let). 
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(seq). II takes the form 

\ iii \ n 2 

r : ei i 9 : v x : e 2 jj, A : w 2 
r : ei seq e 2 JJ- A : i> 2 

There are two cases to consider: 



(seq). 



— The case x G - diff(Ui). By Lemma 6.7, IT contains no non-trivial 
instance of (var x ). By inductive hypothesis, II 2 contains at most one 
non-trivial instance of (var x ). The result follows. 

— The case x G diff(Tli). By inductive hypothesis, LTi contains at most 
one non-trivial instance of (var x ). By Lemma 6.7, Q(x) is a value. 
Furthermore, by Lemma 3.10, A(x) is a value too. By Lemma 6.7, 
again II 2 contains no non-trivial instance of (var x ). The result follows. 

□ 

Lemma 6.10. Suppose F : x J|n x A : v x and diff(H x ) = 0. Then, T(x) = v x . 

Proof. By definition (Definition 6.4), diff(U x ) = implies F = A. On the other 
hand, by Lemma 5.3 we have A(x) = v x . □ 

6.2 Essential Parts of a Heap for a Derivation 

The mission of this section is to provide criteria for distinguishing the essential 
parts of a heap for a derivation. That is, the parts of a heap T without which a 
derivation T : e JJ. A : v is not derivable. Theorem 6.23 formalises this. 

Definition 6.11. For an x G dom(T), define T[x i— > e] by: 

F[ii-> e](x) = e 

T[x \-^ e](y) = T(y) y G dom{Y) 

F[xi-> e](y) undefined otherwise. 

We extend this notation as 

T[xt ^ ei]} =1 = T[x 1 ^ ej 

TJxi ^ e^ =1 = (T[xi ^ ei]™Zi)[x n i-> e n ]. 

Suppose r : e JJ-n A : v and suppose II contains no instance of (var x ). Then for 
any e 1 define H[x \— > e'] to 6e t/ie labelled tree obtained from II fey replacing every 
heap appearing in li with Q[x \—> e'] if x G dom(Q); otherwise, we leave 

Lemma 6.12. Suppose T(x) = e x . Then, T[x \—> e x ] = T. 
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Proof. By Definition 6.11, 

T[x ^ e x ](x) = e x = T(x). 

□ 
Lemma 6.13. Suppose T : e JJ-n A : i> and r(x) zs a wa/ue w x . Then, TL[x t— > 

«*] = n. 

Proof. By Lemma 3.10, for every heap S in II such that x G dom(E), E(x) = v x . 
We use Lemma 6.12 to conclude the desired result. □ 

Lemma 6.14. For any heap Y and variable y such that y ^ dom(T) 

(T,y i-> e y )[x ^ e x ] = (T[x ^ e x ],y ^ e y ). 

Proof. Using Definition 3.3 and Definition 6.11, it is trivial to verify that: 

(I\3/i-> e y )[x ^ e x ](y) = e y = (T[x ^ e x ],y ^ e y )(y) 
(T,y ^ e y )[x ^ e x ](x) = e x = (T[x ^ ej.j/n e y )(x) 

and for any z other than x and y 

(T,y !-»■ e y )[x ^ e x ](z) = T(z) = (Y[x ^ e x ],j/i-> e y )(z). 

□ 

Lemma 6.15. Suppose Y : e J|n A : v. Suppose x G dom(II) but x ^ U(II). 
TTien for any e x , 

T[x^e x ] : e Jk^e,] A[xi->eJ : w 

diff(U[x^e x ]) = diff(U) 

V(U[x^e x }) = 7(11). 

Proof of Lemma 6.15 comes in Appendix A.l. 

Definition 6.16. Suppose T is a heap and S is a set of variables. Write T\g for 
the function defined by: 

• T\g(x) = T(x) ifT(x) is defined and x G S. 

• L|s(a;) is undefined otherwise. 

We call this V restricted to S. 

Symmetrically, write T\S for the function defined by: 

• (r \ S)(x) = T(x) ifY{x) is defined and x ^ S. 

• L \ S is undefined otherwise. 

Suppose L : e J|n A : w and suppose V(U) C S. Write U\s for the labelled 
tree obtained by replacing every heap E occurring in U, with E \ S' where S' = 
dom(T)\S. We call this H restricted to S. Note that — because, by Lemma 3.7, 
dom(Y) = dom(A) — there is no difference between T\s and V \S' or likewise 
between A|s and A \ S'. Having this said, we prefer to write T\s '■ e JJ-n| s A|s : v. 
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Lemma 6.17. Suppose that x (£ dom(T) and I : e JJ-n A : v. Then, (T, x i— > e x ) : 

e JJ. n + (A, x i— > e x ) : w where 

n = n + | dom ( r ) 

7(n+) = 7(n) 
d^(n+) = dijf(ii). 

Proof of Lemma 6.17 comes in Appendix A. 2. 

Remark 6.18. In words, II + in Lemma 6.17 is a copy of II in which the binding 
x i— ► e x is added to every heap. It is possible but tedious to formally describe this. 
As the only application is in Lemma 6.17, we omit a formal treatment however. 

Corollary 6.19. Suppose 1CI' and I : e JJ. n A : v. Then, there exists a heap 
A' such that I' : e JJ. n ' A' : v where 



A = A' I 



domiT) 



n — n \dom(r) 

7 (IT) = 7(11) 
diff{W) = diff {II). 

Proof. Let r'\r = {xj i— > ei}™ =1 . The result follows by a straightforward induction 
on n and using Lemma 6.17. □ 

Lemma 6.20. Suppose (I~,x \— > e x ) : e J|n (A~,x i— > e' x ) : v and x ^ 7(11). 
Then, I~ : e JJ. n - A~ : v where 

n~ = R\dom(T-) 

v(u-) = v(n) 

diff (IT) = diff (II). 

Proof. Similar to proof of Lemma 6.17. D 

Corollary 6.21. Suppose I' C I and I : e JJ. n A : v such that dom(I \ I') n 
7(11) = 0. Then, there exists a heap A' such that I' : e JJ-n' A' : u where 

A' = A| dom ( r /) 

II = II| ( iom(r') 

7QT) = 7(n) 
diff {ii') = diff (n). 

Proof. Similar to proof of Corollary 6.19. □ 

Lemma 6.22. The following statements are correct: 

• If I : e JJ. n A : v and 7(11) C 5, tfien T| s : e J| n | s A| s : v. 

• If I\s '■ e J|n| s A|s : v and T(y) = A(y) for every y G dom(I) \ S, then 
I : e Jin A : v. 
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Proof. Suppose L : e JJ-n A : v and V(U) C S* for some 5. By definition 
(Definition 6.16) 

dom(r \ r| s ) n v(u) c dom(r \ r| s ) n 5 = (dom(r) \ 5) n s 1 = 0. 

Thus, by Corollary 6.21, T\ s : e J| n | s A| s : v. 

Suppose on the other hand that T\s ■ e J|n| s A|s : v such that T(y) = A(y) for 
every j/G S" = dom(T) \ S. By an induction on size of 5" and using Lemma 6.17, 
we conclude L : e JJ. n A : v, as expected. □ 

Theorem 6.23 expresses that if L : e JJ-n A : u, then II only depends on the T 
restricted to V(n): 

Theorem 6.23. Suppose T : e JJ-n A : v and suppose T(z) = T'(z) for every 
z G V(II). Then there exist A' and U' such that: 

T':e^A':v diff (IT) = diff (II) 

V(W) = V(U) A'(z) = A(z) for every z G 7(11). 

Proof. Suppose L : e JJ-n A : v. Let L = T\v(n) an d A = A\v(n)- By Corol- 
lary 6.21, I 1 : e ^n[ v(ir) A : v where 

7(n| v(n) )= V(U) (6.1) 

diff(Tl\ v(u) ) = diff(Il) (6.2) 

On the other hand, by Corollary 6.19, V : e JJ-n' A' : v where 

v(n\ v(n) )= v(n') (6.3) 

diff(IL\ m) ) = diff(IL') (6.4) 

Hence, 7(14) = V(W) by (6.1) and (6.3), and diff (II) = diff (W) by (6.2) and 
(6.4). The result follows by noting that A'|y(n) = A = A|ym)- D 
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Chapter 

Heaps and Atomic Variables 



Now that we have identified trivial and non-trivial (var x ) instances, and identified 
the essential parts of a heap for a derivation, we can move on to consider a notion 
of unit of change. Intuitively, this is computations which do cause changes in 
a heap, but cause the minimum change. For this, we particularly underpin the 
variables with such a property — which we call atomic. Definition 7.2 makes this 
formal, and the rest of this chapter explores the properties of atomic variables. 
But, first, we need some introduction: 
Consider these heaps: 

r i = (y i-> x, x ^ y) 

^2 = (y | — *■ x, x i— >■ let z = Xz.z in zz) 

F3 = (y <— ► x, x i— > Xz.z) 

These impose evaluation dependencies on variables: for r l5 to evaluate the ex- 
pression bound to y we must evaluate that bound to x, and vice- versa; for T 2 , 
to evaluate the expression bound to y we must evaluate that bound to x, but 
to evaluate the expression bound to x we need evaluate no other parts of the 
heap; for Ts, to evaluate the expression bound to y we do not need to evaluate 
the expression bound to x, because it is already evaluated. This leads us to the 
following definition: 

Definition 7.1. Suppose T is a heap. Define the (evaluation) dependency 
order by: 

x r^r y when T : y JJ-n A y : v y and x G diff(U y ) 

Suppose now that T : e JJ-n A : v. It is a fact that in this case, ^r is acyclic 
on diffill). Intuitively this is because otherwise II would 'go on forever' and 
therefore could not exist; consider for example trying to construct some II of the 
form Ti : y -JJ-n A : v. 1 

In this section we develop part of the theory of the dependency order. Defi- 
nition 7.2 (atomic variables) characterises variables that are least, in the depen- 
dency order, and represent units of evaluation that update exactly one variable in 



X A slight subtlety: x <y 2 Vi but as we have no theory of partial derivations, x ^ri V- As 
discussed, x ^p 3 V-, because T 3 (x) = Xz.z is already evaluated. 
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the heap. Theorems 7.10 and 7.11 remove least elements from and add them to 
the initial heap of an evaluation; in effect, 'pruning' and 'restoring' least elements 
from ^r- Finally, Theorem 7.12 establishes a useful and important part of the 
fundamental observation above — that for any II, there must be some atomic 
variable in diff(Yl), whenever diff(YY) ^ 0. This is all that we will need for the 
results to follow. 

^r is implicit in what follows but we do not need all of it. We just need 
that least elements always exist, and that the number of variables mentioned in 
a derivation is finite; therefore, we shall only need to talk about atomic variables. 
We leave a full treatment of the dependency order to future work. 

Definition 7.2. Call x atomic in Y when there exist A x , v x , and U x such that 
Y : x i^n x A x : v x and diff(U x ) = {x}. Write atomic(Y) for the set of atomic 
variable symbols in Y. 

So x is atomic in Y when it can be calculated without affecting the rest of 
the heap (note that if Y(x) is a value then x is not atomic in Y; we only measure 
non-trivial evaluation in the evaluation dependency order). This notion, as it 
turns out, is extremely useful in the proofs to follow. Note that x is atomic with 
respect to a heap — not with respect to any particular evaluation r : e JJ- n A : w. 

We will prefer the following slightly more succinct characterisation of atom- 
icity: 

Lemma 7.3. x G atomic(Y) if and only if there exist v x and Yi x such that Y : 
x JJ.^ Y[x ^ v x ] : v x and diff(IL x ) = {x}. 

Proof. (<=) Suppose Y : x JJ-n x Y[x i— > v x ] : v x and diff (Yi x ) = {x}. By Defini- 
tion 7.2, taking A^ = Y[x \— > v x ] suffices to show that x is atomic in Y. 

(=^>) Suppose on the other hand that x is atomic in Y. By Definition 7.2, 
there exists A x , v x , and H x such that Y : x J|n x A x : v x and diff(H x ) = {x}. 
Because diff(H x ) = {x}, by Definition 6.4, the only variable y for which Y(y) ^ 
A(y) is x. Furthermore, by Lemma 5.3, A x {x) = v x . Consequently, A x = Y[x i— > 
v x ], as required. □ 

Notation 7.4. Hereafter, "_" will represent the wildcard in our notation, so that, 
by Y : e JJ. A : _, we are clarifying our lack of interest in the final value obtained 
after evaluating e in Y. Write Y : e J|n for Y : e JJ-n - : - 

Definition 7.5 is related to Definition 6.11, as is the notation used, but the 
constructions have different preconditions and distinct properties. 

Definition 7.5. Suppose Y : e JJ-n A : v such that x G ^(11). Suppose also that 
x G atomic{Y), so that in particular Y : x J|n x A x : v x for some H x , A x , and v x . 

We define a labelled tree Yl[x \— > v x ] by inductively transforming II based on 
its final rule. For each case, when x ^ dom(Y), define H[x \—> v x ] = II. Cases 
when x G dom(Y) are explained below: 

• (lam). II takes the form 

(lam). 

T : Xx.e JJ- T : Xx.e 
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Define Tl[x \— > v x ] to be 

(lam). 

T[x i— > w x ] : Xx.e JJ. T[x i— > i> x ] : Ax.e 

• (var x ). II tefces the form 



r' : e, J| r' : u a 

var x ). 



(r', x i-> e x ) : x JJ. (r', ihDj):!) 
Define Ii[x \— > v x ] to be 

(lam) 

r' : v x $ V : v x 

(var x ). 



(r', x i-> v x ) : x J| (r', a; i-> u x ) : u 
(let) and (var y ) /or y oi/ier than x. U takes the form 

: n ' 

r' : e' JL A' : v' 

(r) 

T :ej| A : u 

where (r) G {(var y ), (let)}. Define U[x \—> v x ] to be 

;n'[iH a x ] 
r'[x h^ <u x ] : e' JJ. A'[x ^ v x ] : w' 

T[x i— > u x ] : e JJ. A[x i— > w x ] : v 
(app) and (seq). II takes the form 

\ Hi \ n 2 

r : e x jj 9 : u x : e 2 jj. A : w 2 

(r) 



>)■ 



T : ej| A : w 
where (r) G {(app), (seq)}. Define H[x i— > u x ] to 6e 

; IlifiC h-> uj ; n 2 [x h^ « x ] 

r[x i-> u x ] : ei JJ. Q[x h-> uj : v 1 Q[x i-> uj : e 2 JJ. A[x h-> uj : v 2 

T[x i— > v x ] : e JJ. A[x i— ► w x ] : i> 



(r). 



Remark 7.6. The following points about the (var x ) case in Dehnition 7.5 are 
worth considering: 
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• Recall from Theorem 6.9 that there will be at most one non-trivial instance 
of (var x ) in II. In fact, the only difference between Definition 7.5 and 
Definition 6.11 is that the former replaces the unique non-trivial instance 
of (var x ) for an atomic x with a trivial one, whilst the latter does not. 

• T : x JJ. guarantees that r' : x JJ. for every heap such that II contains a 
sub-derivation r' : x JJ.. Furthermore, as we will see in Lemma 7.9, in the 
unique instance of (var x ), x is atomic in such r" as well. 

• Because x G atomic(T), by Lemma 7.3, for any instance of (var x ) in II like 
(T',x \— ► _) : x JJ. (A', x i— > _) : v x , we are guaranteed that r" = A'. Note 
that this includes both trivial and non-trivial instances of (var x ). 

• For any sub-derivation II' of V : e JJ. A : v which contains no non-trivial 
instance of (var x ), H'[x i— > v x ] in Definition 7.5 and II' [x i— > v x ] in Defini- 
tion 6.11 coincide. 

Lemmas 7.7 and 7.8 are key technical results, which are needed for Theo- 
rems 7.10 and 7.11: 

Lemma 7.7. Suppose x G atomic(T) , so Y : x JJ. n:c r[x i— > v x ] : v x for some Ii x 
and v x , and diff(H x ) = {x}. 

Suppose r : y J| n A y : v, for some y other than x. Then if x G V^ILy) then 

y^v(u x ). 

Proof. We prove the contrapositive. Suppose that y G V(H X ). By assumption 
diff(H x ) = {x} and it follows by Lemma 6.5 that T(y) is a value. Therefore, 
as demonstrated in the proof of Lemma 5.4, H y consists of an instance of (lam) 
followed by an instance of (var y ). Namely, V(U y ) = {y}, and in particular, 
x <£ V(U y ). The result follows. □ 

Lemma 7.8. Suppose x G atomic(T) , so Y : x JJ-n^. T[x i— > v x ] : v x for some H x 
and v x , and diff(TL x ) = {x}. 

Suppose r : e JJ. n A : v and x G - V(n). 

Then x G atomic (A), and moreover A : x JJ. A[x i— > v x ] : v x . 

Proof. Suppose T : x Jln^ r[x i— > i>J : w x for some ^ and some Ila, such that diff(U x ) 
{x}. Suppose r : e J| n A : v and x G - V(n). 

By Lemma 6.5, T(z) is a value for every z G V(n x ) \ {x}, and A(z) = T(z) 
by Lemma 3.10. By assumption x G - ^(L 7 ), hence by Lemma 6.3 A(x) = T(x). 
So, 

T(z) = A(z) for every z e V(U X ). (7.1) 

By Theorem 6.23, it follows that there exist 11^, and A' such that A : x J|n^ A' : v x 
and diff(IL' x ) = diff(Ii x ) = {x}. Besides, 

A'(z) = T[x^v x ](z) for every z e V(U X ). (7.2) 
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From (7.1) and (7.2), we conclude 

A(z) = A'(z) for every 2 G V(U X ) \ {x} (7.3) 

A'(x) = v x . (7.4) 

On the other hand, by Lemma 6.3, 

A(z) = A'{z) for every z <£ V(U X ). (7.5) 

Therefore, A' = A[x i— > w^] as a result of (7.3), (7.4), and (7.5). □ 

Lemma 7.9. Suppose Y : x J| n ^ A : Wj. and suppose T(z) = Y'(z) for every 
z G V(H X ). Then, x G atomic(Y) implies that x G atomic(Y'). 

Proof. Suppose x is atomic in Y. By Lemma 7.3, Y : x Jj. na; F[x t— > uj : w^. 
where diff(Yl x ) = {x}. By Theorem 6.23, therefore, there exists a heap A' such 
that r" : x i^u^ A' : v x and diff(Yl' x ) = {x}. Consequently, x is atomic in Y' by 
Lemma 7.2. The result follows. □ 

Theorem 7.10 shows how to reduce a computation in a heap Y to a compu- 
tation in a heap r[iH v x ] which is in some sense 'simpler'. This simplification 
step will be used in key results to follow, including Theorem 8.4 and Lemma 9.1: 

Theorem 7.10. Suppose Y : e JJ. n A : v and x G V(n). 

Suppose x G atomic(Y) ; so in particular Y : x Jj-n^ r[n-^ v x ] : v x for some v x . 
Then A(x) = v x and 

Y[x ^ v x ] : e JJ-n[x^« x ] A : v. 

Furthermore, 

diff(Yl[x^v x ])= diff(YY)\{x} 
V(U[x^v x ])U V(U X )= V(U) 
x G V(Yl[x \-^ v x ]). 

Proof of Theorem 7.10 comes in Appendix B.l. 

Theorem 7.11. Suppose x G atomic(Y); so in particular Y : x ^yi x Y[x \—> v x ] : v x 
for some v x , and H x such that diff (Yl x ) = {x}. Suppose Y[x i— > v x ] : e J|n' A : v 
and x G V^IT). Then Y : e JJ-n A : v for a H such that Yl[x i— > v x ] = Yl' and 
xe V(Yl). 

Proof of Theorem 7.11 comes in Appendix B.2. 

Theorem 7.12. Suppose Y : e JJ-n A : v. Then exactly one of the following 
possibilities holds: 

• diff (IL) = 0. 

• There exists an x G diff (IT) such that x G atomic(Y). 
Proof. By induction on Pi based on the final rule: 
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• (lam). II takes the form 

(lam) 

T : Xx.e JJ. T : Xx.e 

in which case diff (II) = 0. 

• (var x ). II takes the form 

: n ' 

r' : e JJ. A' : v 

(var x ) 



(r', x i— ► e) : x JJ. (A', 1 h d) : jj 



where (T', x i— >■ e) = T and (A',x i— > u) = A. If diff (II') ^ then by 
inductive hypothesis there exists an x' G diff(U') such that x' G atomic(T'). 
By Lemma 7.9, x' G atomic(T), and the result follows. 

Suppose diff (IT) = 0. There are now two cases: If T(x) ^ A(x) then 
x G atomic(T). If Y{x) = A(x) then diff (II) = 0. In either case, the result 
follows. 

(app). II takes the form 



: iii : n 2 

r :ej|6 : Xz.e' 6 : e'[y/z] JJ. A : v 
T : e y JJ. A : u 



(app). 



If diff (III) 7^ 0) then, by inductive hypothesis, there exists a variable x G 
dom(T) such that x G atomic (T), as desired. 

Suppose diff (Hi) = 0. If diff(U 2 ) = 0, then diff (H) = too. Otherwise, 
note that diff (Hi) = implies T = 0. Therefore, we can use the inductive 
hypothesis to conclude that there exists a variable x G diff(H 2 ) such that 
x G atomic(Q). Hence, there exists x G diff (II) such that x G atomic(T), 
as required. 



(let). II takes the form 



:ir 



(T,x l ^e t )ti:e^(A,x l ^e' l )ti:v 

(let). 

T : \et{x i =e i } T l = i\ne JJ. A : v 

If diff (IT) = 0, then for every y G dom((T,Xi i— > e;)" =1 ), it should be the 
case that (T, £; i— » ei)™ =1 (y) = (A, Xi i— > e' i )^ = i(y). Because X{S are fresh 
(Definition 3.4), for every z G dom(T), this implies T(z) = A(z). Hence, 
diff (n) = too. 

Suppose diff (II 1 ) ^ 0. Then, by inductive hypothesis, there exists a 
variable x which is atomic in (T,Xi i— > ej)" =1 . If a; G dom(r), then, 



Chapter 7. Heaps and Atomic Variables 66 



by Lemma 7.9, x is also atomic in F, as desired. If every such x is in 
{xi, . . . ,x n }, we claim that diff (II) = and the result follows. Below we 
prove our claim: 

Suppose on the contrary that atomic((T, Xi i— > e;)" =1 ) C {xi}™ =1 and diff (U) ^ 
0. Then, there exists a variable y G diff(H) such that II' contains a deriva- 
tion S : y 4J, n + _ : v y where F C S. (See Remark 7.13.) By our freshness 
conditions (Definition 3.4), V(H y ) C dom(T). (See Remark 7.14.) There- 
fore, by Theorem 6.23, T : y JJ-n - : v y is derivable. By inductive hypothesis, 
there are two possibilities: 

— diff(H y ) = 0. In this case, by Lemma 6.10, T(y) = v y . Hence, by 
Lemma 3.10, A(x) = v y as well which contradicts y G diff(Jl). 

— There exists a z G diff(H y ) such that z G atomic(T). Then, because 
X;s are fresh, by Lemma 7.9, z G atomic((T,Xi i— ► ei)" =1 ), which given 
that z G dom(L) contradicts atomic((T,Xi i— > ei)™ =1 ) C {xi}™ =1 . 

(seq). II takes the form 



; iii • n 2 

T : ei ij, Q : Vl : e 2 ^ A : v 2 
r : ei seq e 2 J| A : w 2 



(seq). 



If diff(Ui) ^ 0, then, by inductive hypothesis, there exists a variable x G 
dom{T) such that x G atomic(T), as desired. 

Suppose diff (Jli) = 0. If diff (JI2) = 0, then diff(U) = too. Otherwise, 
note that diff(Ui) = implies L = 0. Therefore, we can use the inductive 
hypothesis to conclude that there exists a variable x G diff (II 2 ) such that 
x G atomic(Q). Hence, there exists x G diff (II) such that a; G atomic(T), 
as required. 

□ 

Remark 7.13. In words, y in the (let) case of Theorem 7.12 is the first variable 
in dom(T) which gets evaluated in IT. The graphical intuition is the bottom-most 
left-most x' for which (var x /) occurs in II'. Note that this obviously exists, or the 
contrary assumption diff(U) ^ becomes false. 

Remark 7.14. For the (let) case of Theorem 7.12, by construction (Defini- 
tion 6.1), Xi G V(Hy) implies x^ G dom(Y) or X; G fv(e!) for some F(x') = e'. 
This contradicts the freshness side conditions of (let) in our operational semantics 
(Definition 3.4). 
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Chapter 

Analogous Heaps 



We can now begin to exploit the results of Section 7. First we define a notion 
of analogous heaps T\ ~ T 2 (Definition 8.1). Think of an expression e as a 
transformation on heaps; if T : e JJ, A : v then e causes T to 'evolve' to A, and 
in doing so e calculates v. Then T\ « r 2 when for any e, e calculates the same 
value v in both heaps, though the final heaps may differ (just as in life, when 
two situations are analogous then we expect the same actions to yield the same 
results). 

In Theorem 8.4 we prove that as heaps evolve under evaluation, they remain 
analogous. Thus, heaps are different from a 'store', where the value associated 
to a variable may change arbitrarily. The proof-method is not induction on a 
derivation II; instead, we exploit atomic elements and use induction on the size 
of diff(H) as we promised at the start of Section 7. 

In fact, all that happens to the heap in evaluation is that a variable that was 
bound to an expression e, may become bound to a value — and because heaps 
remain analogous under evaluation, this value is the same as we would obtain by 
evaluating e directly in the intitial heap. This is Theorem 8.5, and Corollary 8.6 
repackages part of that theorem. 

One quite surprising consequence of all this, is that, in presence of a garbage- 
collecting (let) — like that of our operational semantics (Definition 3.4) — value 
equivalence ~„ implies heap equivalence ~fc., and so also implies strict equivalence 
~. s (Lemma 4.5 notes the reverse implication, which is essentially immediate). 
This is Lemma 8.10 and Theorem 8.11. 

Finally, in Section 8.3, we formalise this alternate view about evaluations: 
heaps can be viewed as states of a state transition system which evolve to each 
other upon evaluation of expressions. Theorem 8.19 proves that sa is a bisimilarity 
relation for such a state transition system. 

8.1 Evaluation of Heaps 

We start by introducing a notion of equivalence between heaps, which — as shown 
in this chapter — will prove extremely useful: 
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Definition 8.1. Define Ti w T 2 by: 

Ve.Vu.((3Ai.ri : e J| A x : w) O (3A 2 .r 2 : e J| A 2 : v)\ 

We call T\ and T 2 analogous. 

Lemma 8.2. w zs an equivalence relation (reflexive, transitive, symmetric) . 

Proof, ph is reflexive because for any heap T: 

Ve, v. (r : e J| A : u) O (r : e J| A : v). 

« is symmetric because for any heaps Ti and T 2 , 

Ve, v. (Tx : e J| A : v) O (r 2 : e JJ. A : w) 

implies 

Ve, v. (r 2 : e J| A : v) O (ri : e J| A : «). 

Finally, in order to show that « is transitive, fix heaps Ti, T 2 , and T3 such that 
Ti « T 2 and T 2 Ri T3. By construction (Definition 8.1) then: 

Ve,v. (ri :ej| A : v) O (r 2 : e J| A : v) (8.1) 

Ve', v'. (r 2 : e' J| A : v') O (r 3 : e' J| A : «') (8.2) 

Fix eo and i>o such that ri : eo JJ- A : «o. By (8.1), it follows that r 2 : eo JJ, A : i>o 
which by (8.2) implies r3 : eo JJ. A : i>o- The result follows by symmetry. □ 

Lemma 8.3. Suppose x G atomic(T) , so in particular T : x Jl^ T[x i— > ^J : v x 
for some Ii x and v x . 
Then Y rj Y[x \- > Uj.]. 

Proof. Suppose that r : e JL n A : d. We are supposed to show that there exists a 
heap A' such that r[i i-> ^] : e JJ, A' : v. There are two cases: 

• The case x G - F(II). By Lemma 6.15, T[x i— > v x ] : e JJ, A[x i— > w x ] : w. 
Take A' = A[a; h «J. 

• The case x G ^(n). By Theorem 7.10, T[x \—>v x ]:e tym x ^ Vm ] A : v. Take 
A = A. 

Suppose T[x i— > u x ] : e JJ.n A : w. We should show that there exists a heap A' 
such that r : e JJ, A' : v. There are two cases: 

• The case x G - V(II). Write e x = T(x). By Lemma 6.15, T : e JJ, A[x i— > 
e x ] : u. Take A' = A[xh ej. 

• The case i G V(II). By Theorem 7.11, T : e JJ, A : v. Take A' = A. D 
Theorem 8.4. IfT : e J| n A : v then T Ri A. 

Proof. By induction on the size of diff(U). There are two cases: 
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• The case diff (IT) = 0. So I = A. We use reflexivity of w (Lemma 8.2). 

• The case diff (U) ^ 0. By Theorem 7.12, there exists an x G diff (II) 
such that x G atomic (F), so in particular T : x JJ. r[x i— > uj : w x for 
some v x . By Lemma 8.3, I sa T[x i— > v x ]. By Theorem 7.10, r[x i— > v x ] : 
e JJ-n [:»->•« J A : i> (Definition 7.5). By Theorem 7.10, diff(H[x i— > uj) = 
diff (II) \{x}. By inductive hypothesis, T[x i— > t) x ] sa A. We use transitivity 
of « (Lemma 8.2). □ 

One way to view Theorem 8.4 is as a bisimulation result; if we view expressions 
as actions causing transformations of heaps, then it is easy to use Definition 8.1 
and Theorem 8.4 to show that analogous heaps can perform the same actions, 
and then evolve to analogous heaps. See Section 8.3. 

Theorem 8.5. Suppose I : e JJ. n A : v and suppose x G diff (14). Then A(x) = v x , 
where I : x JJ. A x : v x for some A x . 

Proof. Suppose x G diff (14). By Lemma 6.7, A(x) is a value, write it v x . It 
follows by Lemma 5.4 that A : x JJ. A : v x . By Theorem 8.4, I : x JJ. A^ : v x for 
some A x . □ 

Note that Theorem 8.5 only augments Corollary 6.8 by clarifying the property 
of v x that I : x JJ. A x : w^ for some A^. Theorem 8.4 is the result which makes 
this possible. 

We say a little more about Theorem 8.5: 

• If I : e JJ-n A : v then we can calculate A just from I and from knowing the 
set F(n); we do not have to know e or II. 

• Furthermore, we can do this calculation by evaluating each x G V(n) start- 
ing from I; it is not necessary to evaluate the x G V(H) in the order in 
which they are evaluated in 14 (recall that (app) and (seq) make heaps 
'evolve'). 

Corollary 8.6. Suppose I : e JL n A : v and I : e' J| E ' A' : v' . If 1/(14) = 7(11') 
then A = A'. 

Proof. By Lemma 3.7, dom(A) = dom(I) = dom(A'). So, it suffices to prove 
A(x) = A'(x) for every x G dom(I). 

Suppose that 7(11) = 7 (IT). By Corollary 6.6, diff (II) = diff (II'). Choose 
any x G 7(11). There are two cases: 

• x G diff (14) = diff (II'). By Theorem 8.5, A(x) = A'(x) = v x such that 
r : x JJ. A x : 1^ for some heap A x . 

• xg 7(n)\di]0 r (n)= 7(n')\dij0 r (n / ). By Lemma 6.5, it follows that I(x) is 
a value v x . By Lemma 3.10, then, A(x) = v x and A'(x) = v x . D 
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8.2 Value Equivalence versus Strict Equivalence 

So far, our results have used ~ s (we used ~/> once, in Theorem 4.6, and ~„ 
not at all). This is because ~ s establishes more properties than ~ v or ~^ (see 
Lemma 4.5). Conversely, ~ s is a harder proof-obligation than either ~^ or ~„. 
In this subsection we show that, unexpectedly, ~ v and ~ s are equal. 

Definition 8.7. Write 0, for the term let z = Ax.(xx) in zz. 

Lemma 8.8. There exists no A and v such that T : 0, JJ- A : v. 

Proof. Suppose on the contrary that r : 0, JJ-n A : v is derivable. Then, II takes 
the form 

: n ' 

(r, z \— > Xx.xx) : zz JJ- (r, z i— » Ax.xx) : v 



T : ft JJ. A : v 
where II' in return takes the form 



fletl 



• n" 

(lam) 

T : Xx.xx JJ- T : Xx.xx (T, z i— > Ax.xx) : zz JJ- (T, z i— > Ax.xx) : w 

(app). 

(r, z i— > Xx.xx) : zz JJ- (r, z i— > Ax.xx) : u 

Therefore, by Determinism (Theorem 3.9), II' =i e t Q II". (See Notation 3.8.) But, 
given that IT contains II", this requires II' to be infinite, which is contradictory. 

□ 

Lemma 8.9. Suppose T : e JJ-n A : v and T(x) = Q. Then x $■ V(U). 

Proof. Suppose x G diff(H). By Theorem 8.5, there exist v x , A x , and U x such 
that T : x Jln^ A x : v x and A(x) = v x . By Lemma 8.8, it follows that Y(x) ^ fi. 
Now T(x) is not a value, so by Lemma 6.5, x G V(II) \ diff(H) is impossible. 
The result follows. □ 

Lemma 8.10. Assume e\ ~„ e 2 and suppose T : e\ JJ-n x Ai : v, so that also 
T : e 2 J|n 2 A 2 : v /or some A 2 and Hi- T/ien V(IIi) = 7(II 2 ). 
As a corollary, e\ ~„ e 2 implies e\ ~fe e 2 . 

Proof. Take any x G V(IIi) \ F(II 2 ). We prove a contradiction. 

By Lemma 6.15, T[x i— > fi] : e 2 JJ-ria[x>->n] A 2 [x i— > H] : u. It follows from 
ei ~„ e 2 that T[x i— > Q,] : e x JJ-n' A' x : v for some 11^ and A^. By Lemma 8.8, 
x G - V(n'i)- Write e' = T(x). By Lemma 6.15, V : e x JJ-n' [xi-W] A'Jx i-> e'] : u, and 
F(n;[x ^ e']) = ^(n;). By Theorem 3.9, it follows that U[[x i-> e'} =, eto I^. 
(See Notation 3.8.) Hence, by Lemma 6.2, I^lTJx i— > e']) = V(IIi) — which 
contradicts our assumption that x G F(IIi). The result follows by symmetry. 

The corollary follows immediately, using Corollary 8.6. □ 

Theorem 8.11. e ~ v e' if and only if e ~ s e'. 
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Proof. The right-to-left implication is Lemma 4.5. The left-to-right implication 
is by Lemmas 8.10 and 4.5. □ 

Remark 8.12. Note that e ~fe e' does not imply value equivalence e ~ v e' . For 
example, from Corollary 9.2 it follows that x seq y ~^ y seq x. However, it is a 
fact that x seq y ^ v y seq x. 

Remark 8.13. It is noteworthy that Theorem 8.11 is only correct for Launchbury- 
based systems with garbage-collecting (let) rule. A large number of our results 
will fail without such a (let) rule. In particular, Lemma 8.10 will fail because 
Corollary 8.6 is not valid then. In a nutshell, the reason for this is that without 
a garbage-collecting (let) rule, expressiveness of heaps will grow upon evaluation 
of expressions. See Section 11.3 for further remarks. 



8.3 Heap Bisimilarity 



Although we have so far treated heaps as media for evaluation of expressions, 
an alternate view is also plausible. Heaps can be viewed as states of a state 
transition system which evolve to each other upon evaluation of expressions. 
Interestingly enough, m is a bisimilarity relation for such a system. Lemma 8.17 
and Theorem 8.19 formalise this. 

We start from the definition of a bisimulation relation. 

Definition 8.14. Given a labelled state transition system (S, A, -^), a bisimu- 
lation relation is a binary relation H^ over S when: 

For every p,q G S such that p %. q, and for all a G A 

1. V eS.(p^ p') => 3q' G S. (9 A q') A (p' %. q'). And, 

2. Vq' eS.(g^> g') =* 3p' G S. (p A p') A (p' %. q'). 

Call the largest bisimulation relation a bisimilarity. 

Remark 8.15. The famous Tarski-Knaster Theorem guarantees the the existence 
of the largest bisimulation relation. See [ )] and [32], for example, for proof and 
more. 

Definition 8.16. LetTC, £, andV be the set of all heaps, expressions, and values, 

(e,v) (.,.) 

respectively. Define -^ for the transition system £7% = (TC,£ x V, — —>) such that 

Li { -^l F 2 when Li : e ^ F 2 : v. 
Lemma 8.17. ~ is a bimulation for ^^. 

Proof. Fix Ti and T 2 for which Ti w F 2 and Ti — '—> A\ for some heap Ai. If we 

show that there exists a heap A 2 such that F 2 -^ A 2 and Ai ~ A 2 , the result 
follows by symmetry. 
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By construction (Definition 8.16), Y\ ——> Ai means Ti : e JJ-ni Ai : v for some 
IIi. Given that Li « r 2 , by construction (Definition 8.1), it follows that there 
exists a heap A 2 such that r 2 : e JJ. A 2 : u. In order to show Aj ~ A 2 , suppose 
Ai : e' JJ-n' A^ : i>'. We can use IT and IIj to conclude T 1 : e seq e' JJ. A^ : u': 



i Hi ; n; 

r x : e J| Ax : v A 1 : e' J| Ai : «' 
Ti : e seq e' J| A[ : u' 



(seq). 



Because Ti « T 2 , (by Dehnition 8.1), it follows that there exists a heap A' 2 such 
that T 2 : e seq e' JJ- A 2 : 7/ which takes the form 



r 2 : e J| H : v E : e' J| A' 2 : u' 

(seq). 

T 2 : e seq e' JJ- A 2 : ?/ 

By Determinism (Theorem 3.9), r 2 : e JJ. A 2 : v and r 2 : e JJ. H : v imply S = A 2 . 
Hence, A 2 : e' JJ- A 2 : u', namely, Ai ~ A 2 as desired. □ 

Remark 8.18. In the proof of Lemma 8.17, after establishing the existence of A 2 , 
we could have concluded the desired result in the following way: by Theorem 8.4, 
T\ ~ Ai and T 2 « A 2 . The result follows by transitivity of m (Corollary 8.2). 

Although this latter proof is correct and even shorter, we still preferred to use 
the former for Lemma 8.17. This is to show that Lemma 8.17 is independent of 
Theorem 8.4. Furthermore, the former proof emphasises the fact that seq can be 
viewed as two transition steps for &H (rather than a single one). 

Theorem 8.19. ~ is a bisimilarity for 27^. 

Proof. Suppose ^ is a bisimulation for 27j{. It is straightforward to prove %. C pa, 
and the result is immediate: hx Y\ and F 2 such that Y\$S?,- Given that %^ 

is a bisimulation for £^, by construction (Definition 8.16): Ti — '—> A\ implies 

F 2 — ^ A 2 for some heap A 2 (such that Ax^A 2 ). In other words, Ti : e JJ. Aj : v 
implies r 2 : e JJ. A 2 : v for some heap A 2 (such that Ai^A 2 ). By symmetry, it 
follows that fi » T 2 . D 
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Chapter 

Left-Commutativity 



This chapter shows that the semantics of expressions can be preserved, even when 
the evaluation of certain subexpressions are reordered using selective strictness. 
The main result is the left-commutativity of seq (Theorem 9.3): that (ei seq 
e 2 ) seq e% ~ s (e 2 seq ei) seq e%. By Theorem 4.6, it suffices to prove Corollary 9.2, 
that ei seq e 2 ~h e 2 seq e\. Theorems 8.4 and 8.5 from Chapter 8 play a 
particularly useful role in the proofs of this chapter. 

A key insight is a non-trivial compositionality result on the difference sets 
induced by e x seq e 2 (Lemma 9.1). Suppose V : e x seq e 2 JJ-n A : v. So by 
the semantics for seq, T : e\ Jlrii : V\ and : e 2 J|n 2 A : t>. In words, ei is 
evaluated in T and then e 2 is evaluated in 0, a heap analogous to but not equal 
to the initial heap T. Now, e\ seq e 2 causes T to evolve by evaluating expressions 
associated with variables in diff(U), and this is equal to diff(Tli) U diff(U 2 ). We 
will prove the non-trivial fact that diff(U) is equal to diff(Hx) U diff(U' 2 ), where 
r : e 2 JJ-n' A 2 : v 2 . The proof is not by induction on II, but by induction on 
diff(U) using the results of Chapter 7. 

Once we have the compositionality result, e\ seq e 2 ~h e 2 seq e\ (Corol- 
lary 9.2) follows by Theorem 8.5, and left-commutativity (Theorem 9.3) is imme- 
diate by Theorem 4.6. 

Lemma 9.1. Suppose T : e\ Jlrii Ai : V\ and T : e 2 i^u 2 A 2 : v 2 . Then, for some 
II and A, 

T : ei seq e 2 J| n A : u 2 , 
d^(n) = diffiUx) U rf^(n 2 ), and 

7(n)= 7(ilJu i/(n 2 ). 

Proof. We work by induction on the size of diff(Ui). 

Suppose dijJ(Ui) = 0. So, Ai = T, and we construct II as follows: 



; II! ; n 2 

r : ei 4 T : v x Y : e 2 ^ A 2 : v 2 

T : ei seq e 2 J| A 2 : w 2 



(seq). 



Clearly, di#(n) = di#(II 2 ), so d#(n) = ^(IT) U d»tf(II 2 ). Also, V(U) 

v(n x )Lj v(u 2 ). 
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Suppose diff (IF) ^ 0. By Theorem 7.12, there exists a z G diff (Hi) such 
that z G atomic (T), so in particular T : z J|n i r[zi-> v z ] : w z for some H z and w z . 
By Theorem 7.10, Y[z \— > « z ] : ej JJ-ni [zi-niJ Ai : w i where 

diff(Ui[z^v z ]) = diff(Ui)\{z}, (9.1) 

^(IFfz .-> wj) U V(IL Z ) = 7(ILJ, and, (9.2) 

zG 7(IIi[;z <-►«*]). (9.3) 

There are two cases here: 

• z G F(n 2 ). Then, by Theorem 7.10, T[z i— > i) z ] : e 2 •H-n 2 [z>->-^] A 2 : w 2 where 

dt#(n 2 [z ^ «J) = diff(U 2 ) \ {z}, and, (9.4) 

V(U 2 [z .-> v z ]) U V(U Z ) = V(U 2 ). (9.5) 

• z G - F(n 2 ). Then, by Lemma 6.15, 

r[z ^ u z ] : e 2 ^n a [*>-«,] A 2 [z h^ v z ] : w 2 
(II 2 [2: i— ► w z ] in the sense of Definition 6.11) where 

diff(U 2 [z .-> v z }) = diff(IL 2 ) = diff(U 2 ) \ {z}, and, (9.6) 

V(Il 2 [z^v z ])=V(Il 2 ). (9.7) 

In either case, by inductive hypothesis, there exist II' and A' such that 

Y[z i-> v z ] : ei seq e 2 JJ-n' A' : u 2 , (9.8) 

difCn') = difCni^w,]) U diff(IL 2 [z^v z }), and, (9.9) 

7(n') = V(TLi[z .-> v z }) U V(Il 2 [z .-> v z }). (9.10) 

By substituting (9.1) and (9.4) (or (9.6)) into (9.9), we get 

diff(W) = (diff (III) U diff(U 2 )) \ {z}. (9.11) 

Furthermore, (9.3) and (9.10) imply z G V(U'). So, by Theorem 7.11, T : ei seq 
e 2 JJ-n A' : v 2 for some II such that II[jz i — > v z ] = IT 7 . By Theorem 7.10, 

diff(U) = diff(W)U{z}, and, (9.12) 

7(H')U V(U Z ) = 7(11). (9.13) 

Finally, by substituting (9.11) into (9.12), we get diff (II) = diff (Ui) U diff(U 2 ). 
Additionally, as required, V(H) = V(Ui) U V r (II 2 ) follows from (9.13) and the 
fact that z e 7(11). □ 



Corollary 9.2. e x seq e 2 ~ fe e 2 seq e 



i- 
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Proof. Suppose F : e\ seq e 2 JJ-n A : i> 2 . We will show that F : e 2 seq ei JJ- A : V\ 
for some V\\ the result will then follow by symmetry. 
II takes the form 

; ni ; n 2 

r : ei jj : v x 6 : e 2 jj, A : v 2 

— (seq). 



T : ei seq e 2 J| A : v 2 

By Theorem 8.4, r « 9. Since : e 2 JJ- A : w 2 , T : e 2 J| n ; ©' : « 2 for some 9'. 
Also, by Theorem 8.4, 0' ~ T. Therefore, 0' : e\ JJ-n' A' : v\ for some A'. We 
combine Il 2 and 11^ to construct a derivation IF of T : e 2 seq e\ JJ- A' : V\. 

By Lemma 9.1, V{Tl) = V{W) = 7(n x ) U F(II 2 ) and dtjf (II) = dtff (IF) = 
diff(Ui) U rfz/f (II 2 ). We now reason by cases: 

• Suppose z G diff(U). By Theorem 8.5, A(z) = « z = A'(z), such that 
T : 2 JJ- _ : v z . 

• Suppose z G dom(T) \ diff(U). A(z) = T(z) = A'(z) by Lemma 6.3. 

The result follows. □ 

Theorem 9.3 (Left-commutativity). 
(ei seq e 2 ) seq e 3 ~ s (e 2 seq e x ) seq e 3 . 

y4s a corollary, e\ seq (e 2 seq e 3 ) ~ s e 2 seq (ei seq e 3 ). 

Proof. From Corollary 9.2 and Theorem 4.6. 

The corollary follows by associativity (Theorem 5.1). □ 
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Chapter 

Idempotence 



This chapter proves a general form of idempotence: e seq e ~ s e, extending the 
more restricted Theorem 5.5 (x seq x ~ s x). Given the results we have proved 
since then, the general version is not hard. First, we need a very natural property: 
once an expression has been evaluated, evaluating it again in the resulting heap 
has no further effect on that heap. This is Theorem 10.1. This chapter borrows 
heavily from [71]. 

Theorem 10.1. IfT : e J| A : v then A : e JJ. A : v. 

Proof. Suppose T : e JJ. n A : v. We work by induction on the size of diff(H). 
There are two cases: 

• The case diff '(H) = 0. Then, by definition (Definition 6.4), T = A and 
we are done. 

• The case diff (II) ^ 0. By Theorem 7.12, there exists an x G diff '(H) such 
that x G atomic(T), so in particular T : x JJ. T[x h- > v x ] : v x for some v x . By 
Lemma 6.7 x G V(n), so by Theorem 7.10, T[x i— > v x ] : e ^nfan-*^] A : v 
and diff(H[x \—> v x ]) = diff (H) \ {x}. (See Definition 7.5.) By inductive 
hypothesis, A : e JJ, A : v. D 

The result follows. 

Theorem 10.2 (Idempotence). e seq e ~ s e. 

Proof. Suppose that T : e seq e JJ-n A : v. Then II must have the form: 



; III ; n 2 

r : e^6 :v 6:e^A:v 



(seq). 



T : e seq e JJ, A : v 

By Theorem 10.1, 6 : e JJ. 6 : v. By Theorem 3.9, 6 = A. The result follows. □ 

Finally, we exploit associativity and idempotence (Theorems 5.1 and 10.2) to 
obtain a nice corollary: 
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Corollary 10.3. The following equivalences are correct: 

1. e x seq (e x seq e 2 ) ~ s e x seq e 2 

2. (ei seq e 2 ) seq e 2 ~ s e a seq e 2 

Proof. By associativity (Theorem 5.1) it suffices to prove 

(ei seq e x ) seq e 2 ~ s e : seq e 2 and 
e x seq (e 2 seq e 2 ) ~ s e : seq e 2 . 

This follows by idempotence (Theorem 10.2) and Theorem 4.7. □ 
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Chapter 

Conclusion and Future Work 



In this chapter, we first present an overall summary of the research in this the- 
sis (Section 11.1). We next provide a big-picture explanation about our proof 
techniques in Section 11.2. Finally, in Section 11.3, we end this chapter with a 
discussion on the limitations of this thesis, as well as possible directions for future 
work. 

11.1 Summary 

Selective strictness is a widely used, but potentially dangerous technique in call- 
by-need languages. The equivalences we prove in this thesis are new, as are the 
proof-methods. 

This thesis begins by reviewing the related work (Chapter 2). We then start 
our technical development by considering three operational notions of program 
equivalence (Chapter 3). Each equivalence is based on observing a different part 
of the program's behaviour: the value returned (~ v ), the final heap (~h), or both 
(~ s ) (Chapter 4). Subsequent chapters use the strongest equivalence ~ s , but as 
it turns out, because of the garbage-collecting (let) rule, ~„ coincides with ~ s 
(Section 8.2). 

In Chapter 5, we make two developments: We first prove Associativity of seq 
(Theorem 5.1) and Idempotence of seq for variables (Theorem 5.5). Next, we 
discuss a subtlety of the operational semantics of seq (Section 5.2). 

Proving the Left-Commutativity of seq is far more challenging, as it entails 
proving equivalences between expressions where the order of evaluation of subex- 
pressions has been changed in arbitrary ways. Chapters 6, 7, and 8 provide 
interesting insights while assembling the necessary machinery. 

We find it important to distinguish between non-trivial uses of a variable that 
cause evaluation, and hence change the heap, and those trivial uses that do not. 
We identify the difference set of non-trivial variables in a derivation II as diff(U). 
Finally, we show that expressions are indeed evaluated at most once (Chapter 6). 

We develop some theory of a dependency relation between variables, imposed 
by heaps. We identify a particular special case of atomic variables, which are least 
in the dependency order, and which represent units of evaluation that update 
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exactly one variable in the heap. Theorem 7.12 is a key result showing that for 
any II there must be some atomic variable in diff (II) (Section 7). 

By viewing expressions as heap transformers, we dehne a notion of analogous 
heaps, and show that as heaps evolve under evaluation, they remain analogous 
(Chapter 8). This gives rise to a notion of bisimilarity between heaps (Sec- 
tion 8.3). 

The results from Chapters 6 to 8 are applied to show that the the call-by- 
need semantics of expressions can be preserved, even when the evaluation of 
subexpressions are reordered using selective strictness. The main results are Left- 
Commutativity and Idempotence of seq (Theorems 9.3 and 10.2, respectively). 
Key insights are a non-trivial compositionality result on the difference sets, the 
commutativity of seq under heap equivalence (Chapter 9), and idempotence for 
expressions considered as heap-transformers (Theorem 10.1). 

11.2 Proof Techniques 

We now comment on aspects of our proof-techniques: 

Because evaluation of subexpressions may be reordered in particularly com- 
plex ways by selective strictness, straightforward inductive principles seem to 
fail; instead we develop the basic properties of diff (II) (Chapter 6) and then in 
three central technical results, Theorems 7.10, 7.11, and 7.12, we lay the ground- 
work for simple and compact inductions found in Theorem 8.4, Lemma 9.1, and 
Theorem 10.1. These, in turn, make possible the proofs of our main results in 
Chapters 8, 9 and 10. 

In the course of these proofs, an interesting and subtle interplay becomes 
apparent between heaps and derivations: given a heap T and an expression e, we 
may be able to form a derivation II for evaluating e in T (which by Theorem 3.9 is 
unique up to fresh naming, if it exists). In II we can observe where the expression 
bound to a given variable x is updated (if at all); this occurs at the unique (by 
Theorem 6.9) non-trivial instance of (var x ). However, it is a fact that this order 
of update must be a refinement of the dependency order imposed by the heap 
T. Thus, in some sense, information flows from a heap V to the structure of 
any derivations built using it. On the other hand, it is not a priori obvious 
that the dependency order provides any useful information, as the example of 
I\ = (x \— > y, y i— > x) in Chapter 7 showed; no derivation can exist using x and 
y in this heap. However, Theorem 7.12 guarantees that if a derivation n does 
exist, then T must provide at least one atomic element (least in the dependency 
order). In this sense, information also flows back from derivations to heaps. We 
can see this in our technical results; for example in Theorems 7.10 and 7.11 
information flows from the heap to derivations, whereas in Theorem 7.12 we 
start from the derivation and deduce properties of the heap. The influence of 
this interplay extends beyond the proofs, to our most basic definitions. Thus, we 
do not define x to be atomic with respect to a derivation II; we do not write in 
Definition 7.2 that x is 'atomic' when diff(U x ) = {x} for 11^ the subderivation of 
II above a non-trivial instance of (var x ) in II. If we did, then proofs would fail, 
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because transformations of II may change the order of evaluation and enlarge the 
subderivation such that diff(H x ) becomes non-empty. 

11.3 Limitations and Future Work 

As exemplified in Remark 3.5, the side condition Xi ^ fv (v) for our garbage- 
collecting (let) rule causes arguably reasonable lazy programs to fail to reduce in 
our system. Here is an example of the situations which motivated us for having 
this side condition. Let us suppose, for the sake of argument, that we had removed 
the Xi ^ fv (v) side condition from our (let) rule. Suppose also the following 



= let X2 = Xx.x in \x.x2, 



e" 

e' = e" x\, 



e = let x\ = Xx.x in e', 
Ti = {xi I— > Xx.x}, and, 
T 12 = T\ U {x 2 1— > Xx.x}. 

Then, : e ijf. Suppose otherwise, namely, suppose that : e JJ.. We illustrate 
the problem by considering the derivation. If : e JJ. was derivable in our system, 
it would have taken the form 

(lam) 

r 12 : Xx.x 2 JJ. Ti2 : Xx.x 2 ; rr 

(let)* • n 

Ti : e" J| Y x : Xx.x 2 Y x : x 2 JJ. _ : _ 

(app) 



I\ : e' J| _ : _ 

(let! 



: e JJ, _: _ 

(Note that (let)* is not a valid instance of (let) because x 2 G fv(Xx.x 2 ).) However, 
n cannot exist because x 2 is not bound in r l5 and therefore, there is no rule to 
apply. In fact, although x 2 was bound in T 12 , the (erroneous) garbage-collecting 
(let)* rule takes it back. This is whilst x 2 is still hiding behind a A-abstraction 
(in Xx.x 2 ). 

It is natural to question the necessity of garbage-collection here. This question 
becomes more appropriate when one realises that real-world Haskell program- 
mers might use programs such as e or e" in practice. Interestingly enough, for 
non-lazy programming languages, it is arguable that programs like e" must be 
banned because they are letting x 2 out of its scope. However, this is not a com- 
pletely reasonable argument in presence of lazy evaluation. The reason is that, in 
lazy evaluation, we never know when bindings are going to be used. Therefore, 
whilst scope-violation errors should be caught using other syntactic scrutinies, 
the let-bindings should remain in the heap until they are finally used, if ever. 

Whilst it might be possible to defend programs like e or e" this way, there is 
no reason why an operational semantics should allow a program like e'" = (let x = 
Xy.y in Xz.z) seq x. Unfortunately, any non-garbage-collecting (let) rule would 
cause programs such as e'" to reduce successfully. The reason is that, as also 
mentioned in Remark 8.13, the expressiveness of heaps will unreasonably grow 
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without garbage-collection. For example, the binding x \— > Xy.y will unreasonably 
remain in the heap when the evaluation of let x = Xy.y in Xz.z is complete. 
Thus, the operational semantics has no way to stop evaluation of x afterwards 
whilst x has no meaning from that point on (namely, after the evaluation of 
let x = Xy.y in Xz.z). 

One might argue that catching errors such as the one in e'" is not a task for 
the operational semantics. Although this is partly correct, one needs to note 
that if the operational semantics fails to catch such errors, then pointless equiva- 
lences such as e 1 " ~„ Xz.z will arise. In other words, e 1 " would be a valid citizen 
of language as far as the operational semantics and observational equivalence 
are concerned. This can well foster numerous mysterious implementation bugs 
for code optimisers which typically work in the total trust of the soundness of 
observational equivalence (s) provided. 

Several possibilities are worth studying for retaining garbage-collection whilst 
avoiding the problems mentioned above. Dr. Gabbay suggested extending the 
definition of values so that it also contains let-bindings of the form let {x t = 
e i}i'=i ' n Ax.e. This would be similar to the syntactic category "Answers" in [10]. 
On the other hand, the student suggested a pre-normalisation of the form: 

(let {xi = e;}™ =1 in Xx.e) y -w let {x { = e;}™ =1 in ((Xx.e) y). 

(Note that this is not new to the literature. Ariola et al. [10, 7, 62] all have 
similar rules. See rules (let-C), and (C) rules in Figures 2.6 and 2.8, respectively, 
as well as lift let in [7].) Both these suggestion try to put every application into 
its context whilst they retain garbage-collection. Whether or not any of these 
possibilities, or a combination of them, will fix the problem is a subject for future 
work. 

As another possibility for future work, one can investigate how to extend our 
proof-techniques to reason about selective strictness in conjunction with parallel 
evaluation. This is motivated by languages like GpH [103] and Eden [ '] that 
combine selective strictness and parallelism. For example GpH uses both sequen- 
tial and parallel compositions, seq and par, where par evaluates both arguments 
in parallel. 
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Appendix 

Additional Semantic Property Proofs 



This chapter contains the proofs we postponed in Chapter 6. This includes 
Lemma 6.15 and Lemma 6.17, which are presented in A.l and A. 2, respectivey. 

A.l Lemma 6.15 

Lemma 6.15. Suppose L : e JJ-n A : v. Suppose x G dom(U) but x (fc V(H). 
Then for any e x , 

T[x^e x ] : e JJ-n^ej A[x^e x ] : v 

diff(U[x^e x ]) = diff(U) 

V(U[x^e x }) = V(U). 

Proof. We inductively transform II into II [x i— > e x ]. We reason by cases on the 
final rule in II: 

• (lam). II takes the form 

(lam) 

L : Xx.e JJ. L : Xx.e 

and H[x i— > ej takes the form 

(lam). 

r[x i— > e x ] : Xx.e JJ. L[x i— > e x ] : Ax.e 

The result is straightforward because V(H) = V(U[x i— > e x \) = and 
diff{Ii) = diff{Ii[x^e x }) = 0. 

• (var y ) for y other than x. H takes the form 



: n ' 

r' : e y ^ A' : u„ 



(r', y ^ e y ) : y J| (A', i/hi;,,): w. 
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(var y ) 
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where (T',y i— » e y ) = L and (A',y i— > v y ) = A. Given that dom(T') = 
dom(T)\{y}, x G dom(Y) implies x G dom(V). So, by inductive hypothesis 

T'[x ^ e x ] : e y JJ-ri'^,-^] A' [re ^ e x ] : v y 

diff(W[x^e x ]) = diff(W) 

V(W[x^e x })= V(W). 

Given that y (£ dom(T'), by Definition 6.1, V(n') = V(U) \ {y}, in par- 
ticular, x i 7 (IT). Similarly, by Definition 6.4, diff(W) = diff(IL) \ {y} 
and 



; W[x ^ e x ] 
T'[x i-> e x ] : e y JJ. A' [re i-> e x ] : w y 



(var y ). 



(L'[x ^ e x ],y ^ e y ) :y^ (A' [a; ^ eJ,i/H «„) : «„ 

The result follows by noting that, by Lemma 6.14, (T'[x i— > ej,y i— > e^) = 
(r',y i-> e s )[rr h^ e x ] and (A' [re i-> ej , y i-» ^) = (A',y i-> ^)[x h^ ej. 

(var x ). This case is out of consideration because x G - V(n). 

(app). II takes the form 



; II! ; n 2 

r :ej|0 : Xz.e' Q : e'[y/z] ij, A : v 

T : e y JJ. A : v 



[appj 



lappj 



rr ^ V(H) implies x ^ V^(TIi) and x £ l/(il 2 ). On the other hand, by 
Lemma 3.7, x G dom(T) implies x G dom(Q). So, by inductive hypothesis, 
T[x ^ e x ] : e tyu^e*] Q[x h-> ej : Az.e' and 0[x i-> e x ] : e'[y/z] ^n 2 [a*-^] 
A[i h ej : d, Thus 

: IIi[rr i-> e x ] ; n 2 [x h^ e x ] 

T[x r ej : e 4 0[x i— > ej : Az.e' 0[x i— > e x ] : e'[y/z] JJ. A[x i— > e x ] : u 

r[rr i— > e^] : e y JJ. A [x i— > e x ] : v 
Clearly, this derivation is H[x \—> e x \. By inductive hypothesis: 

diff(U 1 [x^e x ]) = dijf(U l ) 
V(U 1 [x^e x })= V(IIi) 

and 

^#(n 2 [x ^ e j) = dij0F(n 2 ) 

V(U 2 [x^e x })= 7(n 2 ). 
The result follows by noting that 

F(II[x .-» ej) = V^II^x .-» ej) U F(n 2 [x -> e x }) 
diff(U[x i-> ej) = ^(ITfx i-> e x ]) U d*jF(II 2 [a: .-> ej). 
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• (let). II takes the form 

: n ' 

(r, Xi .-> eJti : e ^ (A, a* h-> e'^ =1 : v 

(let). 

T : \et{xi=ei}i =1 \ne JJ. A : v 

By inductive hypothesis 

(T,Xi ^ ei)™ =l [x ^ e x ] : e ij-w[x^e x ] (A,£; ^ ^)™ =1 [x ^ e x ] : v 
which, by Lemma 6.14, implies 

(r[x i-> e x ],Xi ^ ej)" =1 : e $n>[x»e x ] (A[i !-► ej,^ i-> e-)™ =1 : v. 
Therefore 

; n'[x i-> e x ] 

(T[x ^ e x ],Xi ^ e { yi =l : e J| (A[x ^ e x ], x { ^ e-)™ =1 : v 

(let) 

T[x i— > ej : let {xi=ei\™ =l in e JJ, A[x i— > e^] : f 

which is obviously II[x i— > e x \. Furthermore, by inductive hypothesis 

diff(W[x^e x ]) = diff(W) 
V(W[x^e x })= V(W). 



The result follows by noting that 



V(Ii[x .-> e x }) = V(Ii'[x .-> e x }) \ {x^ =1 
diff(U[x .-> e x ]) = diff(W[x » e x ]) \ { Xi }U- 



D 



A. 2 Lemma 6.17 

Lemma 6.17. Suppose that x ^ dom(T) and Y : e J| n A : v. Then, (T,x 
e x) '■ e U-11+ (A,£ l_ ^ e x) '■ v where 

n = n + | d0TO ( r ) 

7(n+) = 7(n) 

diff(IL+) = diff{Ii). 



Proof. For this proof, without any loss of generality we rename the let-bound 
variables in II to avoid accidental clash with x. We proceed by induction on II 
based on its hnal rule: 
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• (lam). II takes the form 

(lam) 

T : Xy.e J| V : Xy.e 

whilst II + takes the form 

(lam). 

(T,x ^ e x ) : Xy.e JJ. (T,x i-> e x ) : Xy.e 

The result is immediate by noticing that V(Jl) = V(H + ) = diff (II) = 
diff(Ii+) = 0. 

• (var y ) for y other than x. H takes the form 

: n ' 

r' : e y i A' : v y 

( var y) 



(r', y ^ e y ) : y JJ, (A', y ^ v y ) : v y 



where (T',y \— ► e y ) = V and (A',y i— > v y ) = A. Given that x (fc dom(T'), 
we can apply the inductive hypothesis to conclude (T',x i— > e x ) : e y JJ. n '+ 
(A', x i— > e x ) : Wj,. Therefore, II + is derivable with the form 



: n ' + 

(r', x ^ e x ) : e y JJ. (A', i h e,) : u 
(r, x h^ e x ) : y JJ. (A, x i-> e x ) : u. 



(var y ). 



F(I1 + ) = V(H) follows by inductive hypothesis because V(H) = V(H') U 
{y} and V(Ii + ) = 7QT+) U {y}. By inductive hypothesis, diff(Tl' + ) = 
diff (II 1 ). On the other hand, 

- if y G diff (II), then diff (II) = diff(W)U{y} and diff (11+) = diff(W+)U 

-\iyi diff (II), then dijf(n) = diff (II') and di#(II+) = dtjf(II'+). 
Both cases imply diff(U + ) = diff(H), as desired. 
(app). II takes the form 

\ Hi \ n 2 

r : e J| : Az.e' : e'[y/z] J| A : v 
(app). 

T : e y JJ. A : u 

By inductive hypothesis, (T, x i— > e x ) : e JJ. n + (0,x i— > e x ) : Xz.e' and 
(9,x i— > e x ) : e'\y/z\ JJ- n + (A, x i— > e x ) : v where 

iii = n x \dom(r) n 2 = n 2 |d om (o) 



2 



v(n+) = 7(n x ) 7(n+) = 7(n 

dif (n+) = d^(no ^#(n+) = d#(n 
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Therefore, II + is derivable with the form 



: n ^ : n ^ 

(r, x h^ e x ) : e JJ- (9, x h^ e x ) : Xz.e' (9, x *-> e x ) : e'[y/z] -(J- (A, x ^ e x ) : v 
(app). 

(r, x h^ e x ) : e y JJ, (A, x ^ e x ) : v 

Note that V(U + ) = V(U+)U V(U+), diff(U + ) = diff (n+) U diff (n+) , and, 
by Lemma 3.7, dom(I) = dom(Q). The result is immediate. 

(let). IT takes the form 



: n ' 

(r, Xi i-> efe : e j), (A, x, h-> e^Li ■ « 
T : \et{x i =e i } 7 l =l \r\e J| A : w 



let) 



Given that dom(I) = dom((I,Xi \—> ej)™ =1 ) \ {x i }'^ =l , the assumption 
x ^ dom(r) implies x ^ dom((T,Xi i— > e,)" =1 ). By inductive hypothesis, 
(r, Xi h^ e;, x h^ e x )7 =1 : e JJ- n '+ (A, Xj i-> e-, x h^ e x )f =1 : v where 

n = n \dom((r,xi^ ei )Y =1 ) 

V(W+) = V(W) 

diff(n'+) = diff (ii'). 

Thus, II + is derivable with the form 

■IL'+ 
(r, Xi ^ e h x ^ e x )i =1 : e J| (A, x t ^ e[, x ^ e x )" =1 : w 

(r, x^ e x ) : let {x;=e;}™ =1 in e JJ. (A, x h^ e*) : w 
The result follows by noting that 



(let). 



dom(I) = dom((I,Xi ^ ei)? =1 ) \ {x;}™ =1 

dom((I, x ^ e x )) = dom((I, x { ^ e h x ^ e x )™ =1 ) \ {x;}™ =1 

V{H) = v{m)\{ Xi }u 

7(n+) = F(n'+)\{x J }- =1 

d#(n) =^(no\{x^ =1 

dtf(n+) =^(n'+)\{x i }? =1 . 

Note that by the renaming policy taken for this proof, name clash between 
x and Xi's is now impossible. 

(seq). II takes the form 



; iii • n 2 

r : ei i : v x B : e 2 ij- A : v 2 
I : ei seq e 2 J| A : i> 2 



(seq). 
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By Lemma 3.7, x G dom(T) implies x G dom(Q). So, by inductive hypoth- 
esis, (r, x i— > e x ) : ei J| n + (9, x \—> e x ) : v\ and (0, x i— > e x ) : e2 JJ- n + (A, x i— > 
e s ) : «2 where 

iii = n 1 |d 0m (r) n 2 = n 2 |d OTO (o) 

7(n+) = 7(n x ) 7(n+) = 7(n 2 ) 

d^(n+) = ^(ilj *jfl F (n+) = ^#(n 2 ). 

Therefore, n + is derivable with the form 



n+ : n 



(r, x h^ e x ) : e 1 JJ- (8, mej: v x (6, x h^ e^) : e 2 ^ (A, a; h^ e x ) : v 2 

(seq). 

(r, x h^ e x ) : e x seq e 2 JJ- (A, x ^ e x ) : v 2 

Note that V(U+) = V(U+)U V(U+), diff(U + ) = diff (n+) U diff (n+) , and, 
by Lemma 3.7, dom(T) = dom(Q). The result is immediate. 

□ 



B 



Appendix 

Additional Heap Proofs 



This chapter contains the proofs we postponed in Chapter 7. This includes The- 
orem 7.10, and Theorem 7.11, which are presented in Sections B.l and B.2, 
respectivey. 

B.l Theorem 7.10 

Theorem 7.10. Suppose T : e J| n A : v and x G V(U). 

Suppose x G atomic(T); so in particular T : x ^n^ r[iH» v x ] : v x for some v x . 
Then A(x) = v x and 

V[x h+v m ]:e ^n[»-H;,] A : v. 

Furthermore, 

diff(Il[x»v x ]) = diff(IL)\{x} 

V(U[x .-> v x \) U V(U X ) = V(U) 
x G V(n[rc i-> uj). 

Proof. Induction on II based on its final rule: 

• (lam). II takes the form 

(lam). 

T : Xy.e $ T : Xy.e 

There is nothing to consider for this case because x (fc V(n). (In fact, 
7(11) = 0.) 

• (var x ). II takes the form 

r' : e x ^ A' : v x 

(var x ) 



(r', x ^ e x ) : x JJ. (A', x ^ v x ) : v a 
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where (V, x \— > e x ) = T and (A', x \—> v x ) = A. (Note that, as also notihed 
in Remark 7.6, A' = r" here.) By construction (Definition 7.5), in this case, 
II [x i— ► v x ] will take the form 

(lam) 



r' : v x J| V : v x 

(var x ). 



(r', x ^ v x ) : x JJ. (r', x h^ -y^) 

Observe that, by determinism of our operational semantics (Theorem 3.9), 
II =iet« n x . (See Notation 3.8 for =i e t Q -) Thus, x G atomic(T) implies 
diff (IV) = diff(Jl x ) = {x} by Lemma 7.3, and 

= diff(U[x ^v x ]) = diff (n) \ {x} 
V(Il[x .-> u,]) U 7(11*) = {x} U 7(11) = V(TV) 

because by assumption x G 7(11). Finally, note that x G dom((T', x \—> v x )) 
and Il[ii-4 v x ] contains an instance of (var x ). By definition (Definition 6.1), 
these mean that x G 7(II[x i— ► uj). 

(var y ) for y other than x. H takes the form 



■ w 

r' : e y J| A' : «„ 



(r', y !-»■ e„) : y JJ- (A', y ^ « ) : v t 



(var y ) 



where F = (T',y i— > e^) and A = (A',y i— ► i> y ). By assumption x G 7(11) 
which implies x G 7 (II'). It follows by Lemma 7.7 that y ^ 7(IL C ). Thus, 
by Lemma 7.9, x G atomic(T') and F' : x JJ. T'[x i— > u x ] : v x . Namely, 
il'[x i— > v x ] satisfies all the properties given in the lemma. By inductive 
hypothesis, then A'(x) = v x and Y'[x \— > v x ] : e y -^n'^i-^] A' : v y . Hence, 
A(x) = v x and 

(r'[x ^ v x ], y^e y ):y J|n>-^ x ] A : v y . (B.l) 

By Lemma 6.14, 

(r'[x ^ v x ],y ^ e y ) = (T',y ^ e y )[x ^ v x ]. (B.2) 

We use (B.l) and (B.2) to observe that T[x \— ► v x ] : y JJ-nfei-^^i A : v y . 

diff(U[x^v x ]) = diff(U)\{x} 

v(u[x^v x ])u v(u x )= 7(n) 

follow by inductive hypothesis and 

7(n)= 7(n')u{y} 

diff (U) = diff (W)u{y}. 
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(app). II takes the form: 

\ iii \ n 2 

r : e' $ 9 : \z.e" 6 : e"[y/z] $ A : v 
T : e' y ^ A : w 

There are now two cases: 



(app). 



— The case that x G F(IIi). By inductive hypothesis 0(x) = v x and 
T[x i— > u x ] : e' JJ-nifei-vu.,,] © : Xz.e". By Lemma 3.10, A(x) = O(x) = w x . 
Given that 9(x) = v x , by Lemma 6.13, n 2 [x i— > v x ] = U 2 . Thus, by 
the (app) case of Definition 7.5, we can combine II i [a; i— > v x ] with n 2 
to get n[x i— > v x ] and observe that T[x i— > i>J : e'y ^-n^i-*.^] A : i). 
Finally, by definition (Definition 6.1), x G F(IIi[x i— > v x ]), implies 
x G F(n[x i— ► Da;]). The result follows. 

- The case that x G - V(IIi). It follows that x G V(Il 2 ). By Lemma 6.15, 
T[x h^ v x ] : e' J|m [*.-►«„] 0[x h^ v x ] : Az.e". By Lemma 7.8, x G 
atomic(Q), and 9 : x JJ. X : v x for some X . Therefore, by inductive 
hypothesis A(x) = v x and 6[x ^ v x ] : e"[y/z] JJ-n 2 [x^« a; ] A : w. We 
combine Ili[x i— > u x ] with II 2 [x i— > v x ] to get II [x i— > ii x ] and observe 
that T[x i— > v x ] : e'y i^uix^vj A : v. (Note that, by the (app) case of 
Definition 7.5, this combination is indeed Il[x i— ► v x ].) Finally, by defi- 
nition (Definition 6.1), x G F(II 2 [x i— > v x ]), implies x G F(II[x i— > v x ]). 
The result follows. 

In both cases, by inductive hypothesis, 

V(Ii 1 [x^v x ])UV(Il x )=V(Il 1 ) (B.3) 

V(Ii 2 [x .-> v x }) u 7(n x ) = 7(n 2 ) (B.4) 

d*tf(iii[x .-> ««]) = difcno \ {x} (b.5) 

d^(n 2 [x .-> u x ]) = diff(U 2 ) \ {x}. (B.6) 

On the other hand, by Definitions 6.1 and 6.4, respectively, 

V(U)=V(U 1 )UV(U 2 ) (B.7) 

diff(U) = diff(U t ) U diff(U 2 ). (B.8) 

By substituting (B.3) and (B.4) into (B.7), we get 

v(ii[x^v x })u 7(n x )= i/(n). 

Similarly, by substituting (B.5) and (B.6) into (B.8), we get 

diff(U[x^v x ]) = diff(U)\{x}, 
as desired. 
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• (let). II takes the form 

\w 

(let). 

T : let{xj=ei}" =1 ine JJ. A : v 

Observe first that x G atomic(T) implies x G atomic((T,Xi \— > ej)™ =1 ) by 
Lemma 7.9. Secondly, x G V(H) implies x G V(U'). Therefore, by induc- 
tive hypothesis: 

- (T,Xi !-► e;)" =1 [x i-> uj : e Jk'^^] (A, ij i-> eQ? =1 : u which, by 
Lemma 6.14, can be rewritten as (T[x i— > i^Xi i— > e i )^ =1 : e ^n'fxi-*^] 
(A,Xi i— > e^JLi : u - Extending with (let), we deduce r[a? i— > uj : 
let {xi=ei}™ =1 ine JJ-nfrci-i-uJ A : v. By the (let) case of Definition 7.5, 
we know that this combination is indeed II[rc i — > v x ], as desired. 

— (A, Xi i— > e'j)" = i(x) = ^a; which, given that a^'s are fresh, implies A(rc) = 
Given that {xi}™ =1 n dom(r) = 0, by Definitions 6.1 and 6.4, respectively: 

y(n) = 7(n')\{^}? =1 (b.9) 

V(U[x^v x ])=V (W [x^v x ])\{ Xi }? =1 (B . 10) 

diff(Tl) = diff(Tl')\{x i }^ 1 (B.ll) 

dt#(n[z .-> uj) = difCn'b .-> «j) \ {x^ =l . (B.12) 

On the other hand, by inductive hypothesis, 

7(n'[iH«j)u7(n I ) = 7(n') (B.13) 

diff(W[x ^v x ]) = diff(W) \ {x}. (B.14) 

By substituting (B.9) and (B.10) into (B.13), we get 

V(IL[x h- v x }) U V(Jl x ) = v(n). 
Similarly, by substituting (B.ll) and (B.12) into (B.14), we get 

diff(W[x^v x ]) = dijJ(W)\{x}, 
as desired. 

• (seq). II takes the form 



: n x : n 2 

T:eiie-.v l Q : e 2 ij- A : v 2 
r : ei seq e 2 JJ- A : w 2 
There are now two cases: 



(seq). 
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— The case that re G F(IIi). By inductive hypothesis 0(rc) = v x and 
T[x ^ v x ] : ei JJ-mia^ej © : «i- By Lemma 3.10, A (re) = 9(rr) = v x . 
Given that 0(re) = v x , by Lemma 6.13, Il2[rr i— > v x ] = II2. Thus, by the 
(app) case of Definition 7.5, we can combine ILfre 1— > v x ] with n 2 to 
get II[rc 1— > wj and observe that r[re 1— > v x ] : ei seq e 2 JJ-n^i-^] A : v 2 . 
Finally, by definition (Definition 6.1), x G V(IIi[x 1— > v x ]), implies 
x G F(Il[rc 1— > v x ]). The result follows. 

— The case that x G - F(rii). It follows that x G F(n 2 ). By Lemma 6.15, 
r[rr i-> uj : e a Jln^i-^] B[rr i-> uj : t^. By Lemma 7.8, re G 
atomic(Q), and : rr JJ. X : v x for some X . Therefore, by inductive 
hypothesis A (re) = v x and Q[x \— > wj : e 2 JJ-n 2 [xi-hjJ A : t> 2 . We com- 
bine IIi[rc 1— > v x ] with n 2 [rc 1— > ^] to get LT[rr 1 — > w x ] and observe that 
T[rc 1— > w x ] : ei seq e 2 JJ-nfrci-i-uJ A : w 2 . (Note that, by the (app) case of 
Definition 7.5, this combination is indeed II[re 1— ► v x ].) Finally, by defi- 
nition (Definition 6.1), x G F(II 2 [re 1— > v x ]), implies x G V(II[x 1— > v x \). 
The result follows. 

In both cases, by inductive hypothesis, 

V(n 1 [x^v x ])UV(Il x )=V(Il 1 ) (B.15) 

V(Il 2 [x .-> v x }) U V(U X ) = V(Jl 2 ) (B.16) 

diff{H x [x .-> v x \) = diff{H x ) \ {x} (B.17) 

diff (Ii 2 [x .-> v x \) = diff(Ii 2 ) \ {re}. (B.18) 

On the other hand, by Definitions 6.1 and 6.4, respectively, 

v{u) = 7(n a )u v(a 2 ) (b.19) 

dijJ(U) = dijJ(U 1 )Udijf(U 2 ). (B.20) 

By substituting (B.15) and (B.16) into (B.19), we get 

V(Il[x^v x ])U V(Jl x )= V(p). 
Similarly, by substituting (B.17) and (B.18) into (B.20), we get 

diff(U[x^v x ]) = diff(U)\{x}, 
as desired. 

D 



B.2 Theorem 7.11 

Theorem 7.11. Suppose x G atomic(T); so in particular T : x J|n x r[rc i— > v x ] : 
v x for some v x , and 11^ such that diff(H x ) = {re}. Suppose T[x i— >■ v J : e J|n' A : i> 
and re G V(n'). Then T : e J,Ln A : v for a II such that II [re i— > v x ] = U' and 
reG V(Tl). 
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Proof. Write e x = Y(x). We work by induction on II' based on its final rule: 

• (lam). IT 7 takes the form 

(lam). 

T : Xy.e J| Y : Xy.e 

There is nothing to consider for this case because x ^ ^(n 7 ). (In fact, 
V{W) = 0.) 

• (var x ). II' takes the form 

(lam) 

r' : v x $ V : v x 

(var x ) 



(r', x i-> v x ) : x JJ. (r', x\-+v x ): v x 

where Y = (T',x \—> v x ) = A. We take II = U x . By Definition 7.5, as 
x G atomic(Y), we have Tl x [x \— > v x ] = IT. 

(var y ) for y other than x. IT 7 takes the form 

V'[x ^ v x ] : e y JJ, A' : v y 

(var y ) 



(V, y ^ e y ) [x ^ v x ] : y JJ, (A', y ^ v y ) : D, 



where Y = (Y',y \—> e y ) and A = (A',y i— > v y ). By assumption x G ^(IT). 
It follows by Lemma 7.7 that y (fc V(H X ). Therefore, by Lemma 7.9, x G 
atomic(Y'), and Y' : x JJ H : d^ for some heap H. By inductive hypothesis, 

r' : e y JJn y A' : v y for some Yi y such that n,, [x i— > d x ] = II' and (B.21) 
x G V(n„). (B.22) 

We extend 11^ with a (var y ) to get II. By (B.21), we observe that Y : y JJ n 
A : v y such that II[a; i— > v x ] = IT 7 . By definition (Definition 6.1), given that 
x G dom(Y) fl dom(T'), (B.22) implies x G F(II), as desired. 

(app). II' takes the form 



: n; ; ir 2 

r[x ^ v x ] : e' ij, Q : Xz.e" : e"[y/z] JJ A : v 
Y[x i— > Da;] : e'y JJ. A : v 
There are now two cases: 



(app). 



The case that x G V^IL^). By inductive hypothesis Y : e' JJ ni : 
Xz.e" and IIi[a; i— > v x ] = H[. Furthermore, by inductive hypothesis, 
x G F(IIi), which by Corollary 6.8, implies 0(x) is a value. Therefore, 
by Lemma 6.13, H' 2 [x i— > v x ] = H 2 . Hence, by the (app) case of 
Definition 7.5, we can combine IT with U' 2 to construct a II such that 
T : e'y JJ n A : d and U[x i— > v x ] = IT. The result follows because, by 
definition (Definition 6.1), x G V(IIi) implies x G V(II), as desired. 
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The case that x Viji'-^. It follows that x G V(W 2 ). By Lemma 6.15, 
T : e' JJ-n' [an-^-ea.] ©[^ h ^ e J : Xz.e". By Lemma 7.8, x G afomic(0[x i— > 
e x \). By inductive hypothesis, 0[x h e x ] : e"[y/z] J|n 2 A : w for a 
n 2 such that n 2 [x i— > Wj.] = il 2 and x G F(II 2 ). By construction (the 
(app) case of Definition 7.5), we can combine n' x [x i— > e x ] and il 2 to 
get a derivation II such that II [x i— > w x ] = II'. The result follows be- 
cause, by definition (Definition 6.1), x G F(II 2 ) implies x G V(II), as 
desired. 



(let). IT takes the form 

(T[x i-> v x ],Xi ^ e^ =l : e J|(A,X; ^ e[)™ =l : « 



: n; 



(let). 



T[x i— > wj : let {x i =e i }™ =1 in e JJ. A : v 

Observe that, by Lemma 6.14, (T[x i— > v x ],Xi i— > e;)™ =1 = (F,x, \—> ei)™ =l [x \—> 
v x }. Thus, (T,Xi ^ ei) n i=l \x ^ v x ] : e JJ. n ; (A,x; h^ e^f =1 : u. On the 
other hand, given that x;'s are fresh, by Lemma 7.9, x G atomic(T) implies 
x G atomic((T,Xi \-> ei)™ =l ). Therefore, by inductive hypothesis, (T, Xi i— > 
e i)T=i : e ^-n, (A,Xj i— » e^JLj : v for some II; such that rL[x i— > v x ] = !![. We 
extend 11; with (let) to get II such that T : let {x i =e i }" =1 ine J| n A : v and 
II [x i-> v x ] = IT. 

(seq). IT takes the form 

: n; ; n 2 

T[x i— > v x ] : ex JJ. : V\ : e 2 JJ. A : v 2 

(seq). 

T[x i-> v x ] : ei seq e 2 JJ. A : v 2 

There are now two cases: 

— The case that x G 1^(11^). By inductive hypothesis T : e x J| ni : V\ 
and Ui[x i— > v x ] = H[. Furthermore, by inductive hypothesis, x G 
F(rii), which by Corollary 6.8, implies 0(x) is a value. Therefore, 
by Lemma 6.13, II 2 [x i— > v x ] = il 2 . Hence, by the (seq) case of Def- 
inition 7.5, we can combine IT with U 2 to construct a II such that 
r : &\ seq e 2 J| n A : v 2 and II [x i— > v x ] = II'. The result follows 
because, by definition (Definition 6.1), x G V(IIi) implies x G V(II), 
as desired. 

— The case that x G - V^II^). It follows that x G F(IT 2 ). By Lemma 6.15, 
r : e\ JJ-n' [x^ej ®[ x l_ ^ e x] '■ v i- By Lemma 7.8, x G atomic(Q[x i— > 
e x \). By inductive hypothesis, Q[x i— > ej : e 2 JJ-n 2 A : w 2 for a II 2 such 
that II 2 [x i— > v x ] = II 2 and x G F(II 2 ). By construction (the (seq) 
case of Definition 7.5), we can combine H[[x i— ► e x ] and II 2 to get a 
derivation II such that II [x i— > uj = IT. The result follows because, by 
definition (Definition 6.1), x G V^(II 2 ) implies x G V(II), as desired. 
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